Taming Big Data! presentation
Contents
- 2. Discovery is an iterative process
- 3. Discovery in the big data era: Resource-intensive, expensive, slow
- 4. Three big data challenges Channel massive flows Automate management Build discovery
- 5. Three big data challenges Channel massive flows Automate management Build discovery
- 6. Channel massive data flows Data must move to be useful. We
- 7. Transfer is challenging at many levels Speed and reliability GridFTP protocol
- 10. GridFTP protocol and implementations: Fast, reliable, secure 3rd-party data transfer
- 11. 85 Gbps sustained disk-to-disk over 100 Gbps network, Ottawa—New Orleans
- 17. Transfer scheduling and optimization Science data traffic is extremely bursty User
- 18. A load-aware, adaptive algorithm: (1) Data-driven model of throughput
- 19. A load-aware, adaptive algorithm: (2) Concurrency-constrained scheduling Define transfer priority: Schedule
- 22. Robust analytic models for science at extreme scales Gagan Agarwal, Prasanna
- 23. How to create more accurate, useful, and portable models of distributed
- 24. Differential regression for combining data from different sources Example of use:
- 25. End-to-end profile composition
- 26. Three big data challenges Channel massive flows Automate management Build discovery
- 28. One researcher’s perspective on data management challenges
- 30. Tripit exemplifies process automation Me Book flights Book hotel
- 31. How the “business cloud” works
- 32. Process automation for science
- 33. Globus research data management services
- 34. Reliable, secure, high-performance file transfer and synchronization “Fire-and-forget” transfers Automatic fault
- 35. Simple, secure sharing off existing storage systems
- 36. Extreme ease of use InCommon, OAuth, OpenID, X.509, … Credential management
- 39. High-speed transfers to/from AWS cloud, via Globus transfer service UChicago
- 40. Globus transfer & sharing; identity & group management, data discovery &
- 41. Globus under the covers
- 42. Globus under the covers
- 44. Globus Platform-as-a-Service
- 45. The Globus Galaxies platform: Science as a service
- 46. Three big data challenges Channel massive flows Automate management Build discovery
- 47. Discovery engines: Integrate simulation, experiment, and informatics
- 48. A discovery engine for metagenomics
- 50. DOE Systems Biology Knowledge Base (KBase)
- 52. A discovery engine for the study of disordered structures
- 53. Immediate assessment of alignment quality in near-field high-energy diffraction microscopy
- 54. New data, computational capabilities, and methods create opportunities and challenges
- 55. Big Data to Knowledge: bd2k.org
- 56. Three big data challenges Channel massive flows New protocols and management
- 57. My work is supported by:
- 58. Thank you! foster@anl.gov ianfoster.org
Slides and text of this presentation