JupyterHub setup on a CentOS or Red Hat server

I set up a collaboration tool for a new Data Science team that lets everyone share their work and use the computational resources efficiently. I'm a Data Scientist with a background in Computer Engineering, and I wear many hats!

It was initially hard for me to know where to start but I got good references from the Internet. Eventually, I ended up with JupyterHub for my team!

I jotted down some notes during my installation and thought I would share them with others. I hope they help someone. I initially installed JupyterHub on CentOS, and we later migrated to a Red Hat server. The following steps apply to both CentOS and Red Hat.
  1. Check sudo access. You need sudo access for the steps below, because JupyterHub will be served from a dedicated account that uses sudo to spawn the single-user servers.
  2. Download the Anaconda distribution, which comes with default packages and libraries. Anaconda3-5.2.0-Linux-x86_64.sh was the latest installer when I wrote this post.
  3. Install Anaconda at /opt/anaconda3. Run the installer and enter /opt/anaconda3 when it asks for the install location:
    sudo bash Anaconda3-5.2.0-Linux-x86_64.sh
  4. Install npm, nodejs and configurable-http-proxy:
    sudo yum install npm nodejs
    sudo npm install -g configurable-http-proxy
  5. Optional: upgrade JupyterHub if you are reinstalling:
    sudo /opt/anaconda3/bin/pip install --upgrade jupyterhub
  6. Install SudoSpawner, which lets JupyterHub spawn single-user servers without running as root. See the SudoSpawner GitHub page for more information.
    sudo /opt/anaconda3/bin/pip install git+https://github.com/jupyter/sudospawner
  7. Create the account that will serve JupyterHub. It has no password and, after the next step, can use sudo without a password only to run SudoSpawner.
    sudo useradd rhea
  8. Add SudoSpawner to the sudoers file.
    1. Edit the sudoers file:
      sudo visudo
    2. Add the lines below to the file:
      Defaults secure_path=/sbin:/bin:/usr/sbin:/usr/bin:/opt/anaconda3/bin
      Cmnd_Alias JUPYTER_CMD=/opt/anaconda3/bin/sudospawner
      rhea ALL=(%jupyterhub) NOPASSWD:JUPYTER_CMD
  9. PAM configuration. For more details, check the JupyterHub wiki.
    1. Let rhea read /etc/shadow through a shadow group, so that it can verify passwords:
      sudo groupadd shadow
      sudo chgrp shadow /etc/shadow
      sudo chmod g+r /etc/shadow
      sudo usermod -a -G shadow rhea
    2. To run JupyterHub on port 80, allow node to bind to privileged ports:
      sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
    3. Check the permissions for non-root users. NOTE: rhea should not be added to the wheel group.
      ls -l /etc/shadow
      If the shadow group does not have read permission, set it by running
      sudo chmod g+r /etc/shadow
  10. JupyterHub configuration.
    1. Create a separate directory for the JupyterHub configuration file, SSL certificate and key file:
      sudo mkdir /etc/jupyterhub
      sudo chown rhea /etc/jupyterhub
      cd /etc/jupyterhub/
    2. Generate the JupyterHub config file:
      sudo -u rhea /opt/anaconda3/bin/jupyterhub --generate-config
    3. Generate a self-signed SSL certificate and key file:
      sudo -u rhea openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout jhubk.key -out jhubc.pem
    4. Edit the JupyterHub config file (jupyterhub_config.py) and add the lines below:
      # JupyterHub administrators' usernames. Only these users can stop and start other users' servers, and the JupyterHub server itself, from the browser.
      c.Authenticator.admin_users = {'admin1','admin2'}
      c.JupyterHub.ip = 'xxx.xx.xx.xx' # Add your server IP address
      c.JupyterHub.port = 8888 # Port number you want to serve JupyterHub on
      c.JupyterHub.ssl_cert = '/etc/jupyterhub/jhubc.pem'
      c.JupyterHub.ssl_key = '/etc/jupyterhub/jhubk.key'
      c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
  11. Create JupyterHub user accounts.
    1. For example, to create the admin1 account:
      sudo adduser admin1
      sudo passwd admin1
    2. If it is an administrator account, add it to wheel:
      sudo usermod -aG wheel admin1
  12. Create a JupyterHub users group and add users to it. A user who has an account on the server cannot access JupyterHub unless he or she is added to this group.
    1. Create the JupyterHub users group:
      sudo groupadd jupyterhub
    2. Add the "admin1" user to this group:
      sudo usermod -a -G jupyterhub admin1
  13. Test the setup of the rhea sudo account. Your own account ($USER) must be in the jupyterhub group for this test.
    1. This command should print the SudoSpawner help text without asking for a password:
      sudo -u rhea sudo -n -u $USER sudospawner --help
    2. This command should fail, because rhea may only run SudoSpawner:
      sudo -u rhea sudo -n -u $USER echo 'fail'
      It should say something like
      sudo: a password is required
  14. Check if PAM is working:
    sudo -u rhea /opt/anaconda3/bin/python -c "import pamela, getpass; print(pamela.authenticate('$USER', getpass.getpass()))"
    The command asks for your password. It prints "None" once you enter your system password correctly.
  15. Open the port you entered in the JupyterHub config file, if you haven't done so already.
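
Before starting the hub, the permission and group checks from the PAM configuration and users group steps can also be scripted. The following is only a minimal sketch using the Python standard library, not part of the original installation. The function names (group_can_read, user_in_group, preflight_report) are made up for illustration, and the account and group names (rhea, admin1, shadow, wheel, jupyterhub) are the ones used in this post. It only inspects the permission bits and group membership; it does not verify that /etc/shadow is owned by the shadow group.

```python
import grp
import os
import pwd
import stat


def group_can_read(mode):
    """Return True if the permission bits in `mode` grant group read access."""
    return bool(mode & stat.S_IRGRP)


def user_in_group(user, group):
    """Return True if `user` has `group` as a primary or supplementary group."""
    try:
        group_entry = grp.getgrnam(group)
        user_entry = pwd.getpwnam(user)
    except KeyError:
        # Unknown user or unknown group: report "not a member".
        return False
    return user in group_entry.gr_mem or user_entry.pw_gid == group_entry.gr_gid


def preflight_report(shadow_path="/etc/shadow"):
    """Collect readable findings; the account names follow this post."""
    if not os.path.exists(shadow_path):
        return [shadow_path + " not found"]
    mode = os.stat(shadow_path).st_mode
    return [
        "group can read %s: %s" % (shadow_path, group_can_read(mode)),
        "rhea in shadow group: %s" % user_in_group("rhea", "shadow"),
        "rhea in wheel group (should be False): %s" % user_in_group("rhea", "wheel"),
        "admin1 in jupyterhub group: %s" % user_in_group("admin1", "jupyterhub"),
    ]


if __name__ == "__main__":
    for line in preflight_report():
        print(line)
```

Saved under a hypothetical name such as preflight.py, it could be run with /opt/anaconda3/bin/python preflight.py. Group changes made with usermod show up here immediately, although an already logged-in user has to log in again before they take effect in that session.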
Let's serve the JupyterHub!!!
Run the command below to start JupyterHub:
sudo -u rhea /opt/anaconda3/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py

To run JupyterHub as a background process, use the command below.
nohup sudo -u rhea /opt/anaconda3/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py &
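
Once the hub is running, a quick way to confirm that it is listening, and that the port really is open, is to try a TCP connection. This is a minimal sketch and an assumption on my part rather than part of the original setup: port_is_open is a hypothetical helper name, and the host and port in the example are placeholders for the c.JupyterHub.ip and c.JupyterHub.port values in your config file.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder values: replace them with the IP address and port
    # from your jupyterhub_config.py.
    host, port = "127.0.0.1", 8888
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print("%s:%d is %s" % (host, port, state))
```

Running it on the server itself only shows that JupyterHub is listening. Running it from another machine against the server IP also exercises the firewall rule for the port.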

Hope this helps. Please provide your comments if any.

Errors and workarounds

1. PAM session error. This error prevented JupyterHub from restarting. The workaround was discussed in this GitHub issue.
Solution:
Edit the PAM file with sudo vi /etc/pam.d/login and comment out the following line, if it is not commented out already: #session required pam_loginuid.so

2. If you get a "failed to create PAM sessions" error (GitHub issue)
Solution:
Edit your JupyterHub config file and set c.PAMAuthenticator.open_sessions = False

3. If you see "Proxy appears to be running at [] but I can't access it (HTTP 403: Forbidden)"
Solution:
Kill the proxy that is currently running: find it with ps ax | grep proxy and kill the process.

4. If you get a 500 error page
Solution:
Check whether the "rhea" account has been added to wheel. If so, remove it from the wheel group.
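
For error 3 above, picking the proxy out of the ps ax output can be sketched in Python. This is an illustration under assumptions, not the method used in the original setup: the function names are hypothetical, and it assumes the leftover process has "configurable-http-proxy" in its command line and that ps ax prints the usual PID, TTY, STAT, TIME and COMMAND columns.

```python
import subprocess


def find_proxy_pids(ps_output, needle="configurable-http-proxy"):
    """Return the PIDs in `ps ax` output whose command line contains `needle`."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # PID, TTY, STAT, TIME, COMMAND
        if len(fields) == 5 and needle in fields[4]:
            pids.append(int(fields[0]))
    return pids


def running_proxy_pids():
    """Run `ps ax` and return the PIDs of matching proxy processes."""
    result = subprocess.run(
        ["ps", "ax"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return find_proxy_pids(result.stdout)
```

Calling running_proxy_pids() returns a list of PIDs to review. Each stale proxy can then be stopped with sudo kill followed by its PID.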
