Zonca's blog post (https://zonca.dev/2015/09/ipython-jupyter-notebook-nersc-edison.html) showed us that users of Cori's predecessor system, Edison, wanted Jupyter badly enough to jump through a lot of hoops to get it. We knew Jupyter was a big deal: its notebooks, which combine live code, narrative text, equations, and visualizations, had become a standard tool for interactive and data-driven science. So we decided to embrace our Jupyter users. Deploying Jupyter as a science gateway gave them a familiar, browser-based entry point to NERSC systems and data.
What follows are the major milestones, in roughly chronological order:
When we deployed Cori, providing Jupyter on the system itself was part of the plan. We made it work by writing a custom JupyterHub spawner that used GSISSH to launch notebook servers on Cori. => Customization
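A JupyterHub deployment like this is wired together in `jupyterhub_config.py`. The fragment below is only an illustrative sketch of the approach, not our exact configuration; the `sshspawner` class path, host name, and option names are assumptions for the example.

```python
# jupyterhub_config.py -- illustrative sketch, not the exact production config.
# Assumes a custom spawner package (here called "sshspawner") that launches
# single-user notebook servers over (GSI)SSH on a remote compute system.

c = get_config()  # noqa: F821  (provided by JupyterHub when it loads this file)

# Use the custom spawner instead of the default LocalProcessSpawner.
c.JupyterHub.spawner_class = 'sshspawner.sshspawner.SSHSpawner'

# Hypothetical spawner options: which remote host to reach and what
# command starts the single-user server there.
c.SSHSpawner.remote_hosts = ['cori.nersc.gov']
c.Spawner.cmd = ['jupyterhub-singleuser']

# Starting a server on a busy HPC login node can take longer than the
# default timeout, so give the spawner extra time.
c.Spawner.start_timeout = 120
```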
Integrating with Spin, NERSC's container services platform, let us manage Jupyter with Docker containers without having to operate Kubernetes ourselves. => Customization, and not being locked in (e.g., to k8s)
The next milestone was scaling out to more nodes and adding multi-factor authentication (MFA). => Customization
JupyterHub services give us yet another abstraction and customization point that we like and take advantage of. => Customization
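As an example of the services abstraction, a hub-managed service is declared declaratively in `jupyterhub_config.py`. This sketch registers an idle-server culler; `jupyterhub-idle-culler` is a real JupyterHub service, but the timeout and role details here are just example values:

```python
# jupyterhub_config.py -- sketch of a JupyterHub "service" definition.
# Services are processes the Hub manages (or proxies to) alongside itself;
# here the Hub starts an idle-notebook culler as a managed service.

c = get_config()  # noqa: F821

c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        # Command the Hub runs: jupyterhub-idle-culler shuts down
        # single-user servers that have been idle too long.
        'command': [
            'python', '-m', 'jupyterhub_idle_culler',
            '--timeout=3600',  # example value: cull after one hour idle
        ],
    }
]

# In recent JupyterHub versions the culler also needs a role granting it
# permission to inspect users and delete servers.
c.JupyterHub.load_roles = [
    {
        'name': 'idle-culler-role',
        'services': ['idle-culler'],
        'scopes': ['list:users', 'read:users:activity', 'delete:servers'],
    }
]
```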
Binder for HPC: bringing shareable, reproducible notebook environments to supercomputing.
Customizing environments for users and collaborations: we provide a "base" environment and a set of default kernels, but keep the locus of control with users, who can set up kernels of their own.
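The concrete mechanism behind user-controlled kernels is Jupyter's kernelspec: a small `kernel.json` file dropped into a directory Jupyter searches. The sketch below writes one for a hypothetical per-project Python environment; the interpreter path is made up, and a real spec would live under `~/.local/share/jupyter/kernels/<name>/` rather than a temp directory.

```python
# Sketch: writing a Jupyter kernelspec ("kernel.json") that points a
# notebook kernel at a project-specific Python environment.
import json
import os
import tempfile

spec = {
    # argv is the command Jupyter runs to start the kernel; the
    # "{connection_file}" placeholder is filled in by Jupyter itself.
    "argv": [
        "/global/common/software/myproject/env/bin/python",  # hypothetical path
        "-m", "ipykernel_launcher",
        "-f", "{connection_file}",
    ],
    "display_name": "MyProject Python",  # name shown in the notebook UI
    "language": "python",
}

# A temp directory stands in for ~/.local/share/jupyter/kernels/myproject/.
kernel_dir = tempfile.mkdtemp()
kernel_json = os.path.join(kernel_dir, "kernel.json")
with open(kernel_json, "w") as f:
    json.dump(spec, f, indent=2)

print("wrote", kernel_json)
```

Because the spec is just a file in the user's home directory, users can add, remove, and share kernels without any intervention from center staff.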
Jupyter at NERSC is both an R&D project and an active, officially supported core service. We have been able to reconcile those two roles through careful messaging, and through our users' willingness to provide feedback and tolerate some experimentation on our part, because in the end they get something out of it.
Conclusion
Jupyter in HPC is now commonplace. We have been able to give hundreds of HPC users a rich user interface to supercomputing through Jupyter. In the supercomputing context, we see Jupyter as a tool that makes it easier for users to take advantage of supercomputing hardware and software. Some of that ease will come from us at the supercomputing centers, but Jupyter as a project also needs to avoid design decisions that break things for HPC or lock deployments into one way of doing things. Each HPC center is different, and for Jupyter to remain useful to HPC centers it must maintain its high level of abstraction. Our conclusions, then:
- Rich user interfaces like Jupyter have the potential to make interacting with a supercomputer still easier, attracting new kinds of users and helping to expand the application of supercomputing to new science domains.
- Supercomputers and the HPC centers that maintain them are not all alike, and while this poses a challenge to those of us who would like to expand access to supercomputing through Jupyter, the challenge is not insurmountable provided the following conditions hold.
- Even if vendors begin shipping supercomputer systems with Jupyter preinstalled, HPC centers still have to keep up with the demands of their users, and those demands can evolve faster than the hardware/software cycle of supercomputing vendors. Center staff should do more than just turn Jupyter on and walk away.
- As the ultimate rich user interface to supercomputing, Jupyter shows a lot of promise, but it is not there yet. Realizing that promise requires:
- That the Jupyter Project avoid going in directions that break Jupyter or the Jupyter ecosystem for HPC deployments.
- That the Jupyter Project maintain abstraction as a core design value.
- That HPC centers prioritize software development and contributions to open-source projects like the Jupyter Project.
- The next step is to focus on supporting users with scaling needs. We have scaled in terms of the number of users, but now we want to enable those among them who use Jupyter to really make the most of supercomputing. Doing that also requires tools like Dask, Spark, and others to work as seamlessly as possible with Jupyter.
Acknowledgments
This work was supported by Lawrence Berkeley National Laboratory, through the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This work used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory.