Big Data Platform at Pinterest (presentation)
Contents
- 4. Data at Pinterest
- 5. Pinterest Data Architecture
- 6. Pinterest Data Architecture
- 7. Pinterest Data Architecture
- 8. Pinterest Data Architecture
- 10. Hadoop Platform Requirements: ephemeral clusters, access control layer, shared data store
- 11. Decoupling compute & storage
- 12. Centralized Hive Metastore
- 13. Multi-layered Packaging
- 14. Executor Abstraction Layer
- 15. Why Qubole? API for simplified executor abstraction; advanced support for spot
- 16. Pinball for Workflow Management
- 17. Scale of Processing. Scale: 60 billion Pins; hundreds of workflows; thousands
- 18. Why Pinball? Requirements: simple abstractions, extensible in future, reliable stateless computing
- 19. Pinball Design
- 20. Workflow Model. Workflow: a directed graph of nodes called
- 21. Job State. Job state is captured in a token. Tokens are
- 22. Job State Machine
- 23. Master Worker Interaction. Master keeps the state. Workers claim and execute
- 24. Master. Entire state is kept in memory. Each state update is
- 25. Worker
- 26. Open Source: Git repo https://github.com/pinterest/pinball, mailing list https://groups.google.com/forum/#!forum/pinball-users
- 27. Thank You
Slides and text of this presentation