
Monday, March 06, 2023

Blog Reboot (maybe) and What’s Next

It has been almost a decade since I updated this blog. Perusing the previous posts, it is interesting to see the evolution of my professional interests from data storage and clouds to FinTech and startups. This blog appears to have become an exploration of random ideas and interests while I was in between larger projects.

Start and End of PeerCube

My previous interest in crowdfunding and analyzing peer-to-peer lending data evolved into PeerCube, a startup helping retail lenders, and a few institutional lenders, make better lending decisions. While the writing had been on the wall for a while, once Lending Club decided to discontinue its retail lending platform, there was no path forward for PeerCube. As a bootstrapped startup focused on retail lenders and very efficient with our limited resources, we also couldn't figure out a pivot that would let us continue operations. Considering that over the years we were pitched several competitors for potential acquisition, we feel PeerCube did quite well. But it was time for us to start the next chapter in our lives.

Podcast 19: Anil Gupta of PeerCube on P2P Lending Analysis

During the existence of PeerCube, we captured and collected a lot of lending data, specifically from Lending Club's primary and secondary platforms, that we felt was relevant for the broader consumer lending industry. There were also some good posts shared on the PeerCube blog based on analysis of lending data, for example changes in FICO score during the life of loans and how they affected the final outcome of the loans. Unfortunately, they have all lost their relevance now and are no longer available online.

What's Next

During the past few years, I have explored and learned about several random areas, such as autonomous driving, satellite imaging, text mining, options trading, and web development, in addition to moving to Japan for family reasons and learning Japanese. I continue to be interested in AI/ML/DS applications and data-intensive projects.

While I am not planning to restart this blog, going forward I will use it as a personal repository of my notes, most likely in an unfinished, incomplete, and unpolished state.

Tuesday, February 25, 2014

Building a Python/Django Development Virtual Machine

Recently, for a project, I needed to build a Python, Django, PostgreSQL, and NGINX development virtual machine (VM). Below are the steps I followed to build this VM in VMware Fusion on a MacBook Pro (MBP). This post is as much about sharing my build experience as about documenting the steps for my future use and potential automation.

The guidance for this procedure came from How To Install and Configure Django with Postgres, Nginx, and Gunicorn.

Installation

Ubuntu Server OS

The steps were very similar to the ones I covered in my prior post OpenStack: Quick Install using DevStack.

Download Ubuntu 12.04 "Precise Pangolin" x86_64 Minimal CD ISO Image mini.iso.

Start VMware Fusion and select Virtual Machine Library under the Windows option on the VMware Fusion toolbar. This will bring up the Virtual Machine Library window showing all the virtual machines already available.


Virtual Machine Library
Click the Add button and select New. This will bring up the New Virtual Machine Assistant window showing Create New Virtual Machine.

Click Continue without disc, as we will be using the downloaded ISO image. This will bring up the Installation Media section.
Select Use operating system installation disc or image and click the arrows next to Choose a disc or disc image.... Select the Ubuntu ISO image and then click Continue.

The Choose Operating System section should show Linux as the Operating System and Ubuntu 64-bit as the Version. Click Continue.

The Finish section will show the Virtual Machine Summary. Click Finish. Select the location where we want to save the VM file and name the file.

A console window will launch and the OS install will start. Answer the prompts during the install process. Once the update and reboot complete, a login prompt will appear. Log in to the VM.


VM Login Screen

OpenSSH Server

After login, install OpenSSH Server to enable access to Ubuntu VM over SSH.

$ sudo apt-get install openssh-server

Check whether the SSH service is running.

$ service ssh status

Either note down the IP address of the VM from the login screen (shown above) or use the ifconfig command to find it, so that you can SSH into the VM remotely.

$ ssh anil@172.16.191.158

You may need to remove the stored SSH host key if there is a fingerprint mismatch between the VM and the remote client.

$ ssh-keygen -R 172.16.191.158
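
Before trying SSH from the host, it can help to confirm that the VM is actually accepting connections on the SSH port. The following is a minimal sketch, assuming Python 3 on the client machine; the function name port_is_open is my own and the default port 22 is the usual OpenSSH default rather than something configured in this post.

```python
import socket


def port_is_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) when it cannot connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the VM in this post, the call would be port_is_open("172.16.191.158"). A False result only says the TCP connection failed; it does not say why.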

Update Packages

You need to make sure all installed packages are current. Download any package updates and install.

anil@django:~$ sudo apt-get update
[sudo] password for anil: 
Hit http://us.archive.ubuntu.com precise Release.gpg
Get:1 http://us.archive.ubuntu.com precise-updates Release.gpg [198 B]
...

At this point, the VM is ready for installation of Python Virtualenv, Django, PostgreSQL, NGINX, and Gunicorn.

Python Virtualenv

Virtualenv is a virtual Python environment builder used to create separate Python environments. This makes it possible to keep installations, dependencies, versions, and permissions separate for different applications across different virtual environments.

Install python-virtualenv.

anil@django:~$ sudo apt-get install python-virtualenv
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  python-pip python-setuptools
The following NEW packages will be installed:
  python-pip python-setuptools python-virtualenv
...

Now we need to create a virtual environment for our project (in this case, lendcafe) where we can install Python and Django packages.

anil@django:~$ sudo virtualenv /opt/lendcafe
New python executable in /opt/lendcafe/bin/python
Installing distribute...........................................................done.
Installing pip...............done.

You can name your virtualenv anything you like.

Django

Django is a Python web framework. It enables rapid development of common web application tasks and adheres to the DRY principle (Don't Repeat Yourself).

To install Django, first we need to activate the virtualenv.

anil@django:~$ source /opt/lendcafe/bin/activate
(lendcafe)anil@django:~$ 

The activate script modifies the shell prompt to show the currently active environment.
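
The prompt is only a visual cue, so it can be useful to ask the interpreter itself which environment it is running from. This is a minimal sketch and not part of the original build; active_environment is a hypothetical helper name. It relies on sys.base_prefix (Python 3) and falls back to sys.real_prefix, which older virtualenv releases set on Python 2.

```python
import sys


def active_environment():
    """Return (in_virtualenv, prefix) for the interpreter running this code."""
    # In a Python 3 venv/virtualenv, base_prefix points at the system install
    # while prefix points at the environment directory.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases on Python 2 set sys.real_prefix instead.
    in_env = hasattr(sys, "real_prefix") or sys.prefix != base
    return in_env, sys.prefix


in_env, prefix = active_environment()
print("virtualenv active:", in_env, "prefix:", prefix)
```

Run with the environment activated, the prefix should point at the environment directory (here, /opt/lendcafe).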

Install Django

(lendcafe)anil@django:~$ sudo pip install django
[sudo] password for anil: 
Downloading/unpacking django
  Downloading Django-1.6.2.tar.gz (6.6Mb): 6.6Mb downloaded
  Running setup.py egg_info for package django

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
Installing collected packages: django
  Running setup.py install for django
    changing mode of build/scripts-2.7/django-admin.py from 644 to 755

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
    changing mode of /usr/local/bin/django-admin.py to 755
Successfully installed django
Cleaning up...

PostgreSQL

PostgreSQL is an open source object-relational database.

Deactivate the virtual environment.

(lendcafe)anil@django:~$ deactivate
anil@django:~$ 

Install Python dependencies for PostgreSQL

anil@django:~$ sudo apt-get install libpq-dev python-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
...

Install PostgreSQL

anil@django:~$ sudo apt-get install postgresql postgresql-contrib
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
...

NGINX

NGINX is an open source HTTP server and reverse proxy. It is known for high performance and low resource utilization. Instead of relying on threads to handle requests, it uses event-driven asynchronous architecture.

Install NGINX

anil@django:~$ sudo apt-get install nginx
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
...

Gunicorn

Gunicorn is a Python WSGI HTTP Server.

Activate virtualenv

anil@django:~$ source /opt/lendcafe/bin/activate
(lendcafe)anil@django:~$ 

Install Gunicorn within virtualenv

(lendcafe)anil@django:~$ pip install gunicorn
Downloading/unpacking gunicorn
  Downloading gunicorn-18.0.tar.gz (366Kb): 366Kb downloaded
  Running setup.py egg_info for package gunicorn
  ...

Configuration

PostgreSQL

The default superuser for PostgreSQL is called postgres. We need to log in as this user first.

anil@django:~$ sudo su - postgres
[sudo] password for anil: 
postgres@django:~$ 

The shell prompt should now start with postgres@....

Create a user and answer the prompts. I decided not to make this user a superuser, and not to allow it to create databases or new roles.

postgres@django:~$ createuser -P
Enter name of role to add: anil
Enter password for new role: 
Enter it again: 
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles? (y/n) n

Create a new database. I named this database lendcafe.

postgres@django:~$ createdb lendcafe

To grant the new user access to this database, first open the PostgreSQL interactive terminal and then grant all privileges. Type \q to quit.

postgres@django:~$ psql
psql (9.1.11)
Type "help" for help.
postgres=# GRANT ALL PRIVILEGES ON DATABASE lendcafe TO anil;
GRANT
postgres=# \q
postgres@django:~$ 

Test that the new user can log in to the database. If you are not already logged in to the system as the new user, you will get the error shown below. Log in as the new user or use the su command.

postgres@django:~$ psql -d lendcafe -U anil
psql: FATAL:  Peer authentication failed for user "anil"
postgres@django:~$ su - anil
Password: 
anil@django:~$ psql -d lendcafe -U anil
psql (9.1.11)
Type "help" for help.
lendcafe=> 
lendcafe=> \q
anil@django:~$ 

Django

To create a Django project, first switch to the virtualenv directory created during installation and activate the virtualenv.

anil@django:~$ cd /opt/lendcafe
anil@django:/opt/lendcafe$ source /opt/lendcafe/bin/activate
(lendcafe)anil@django:/opt/lendcafe$    

Start a new Django project. If you receive a permission denied error as shown below, change the ownership of your environment directory and then start the Django project again. Check that the project directory was created in the virtualenv directory.

(lendcafe)anil@django:/opt/lendcafe$ django-admin.py startproject lcproject
CommandError: [Errno 13] Permission denied: '/opt/lendcafe/lcproject'
(lendcafe)anil@django:/opt/lendcafe$ sudo chown -R anil:anil /opt/lendcafe
[sudo] password for anil: 
(lendcafe)anil@django:/opt/lendcafe$ django-admin.py startproject lcproject
(lendcafe)anil@django:/opt/lendcafe$ ls -la
total 28
drwxr-xr-x 7 anil anil 4096 Feb 21 11:52 .
drwxr-xr-x 3 root root 4096 Feb 20 10:37 ..
drwxr-xr-x 2 anil anil 4096 Feb 20 10:37 bin
drwxr-xr-x 2 anil anil 4096 Feb 20 10:37 include
drwxrwxr-x 3 anil anil 4096 Feb 21 11:52 lcproject
drwxr-xr-x 3 anil anil 4096 Feb 20 10:37 lib
drwxr-xr-x 2 anil anil 4096 Feb 20 10:37 local
(lendcafe)anil@django:/opt/lendcafe$ 

For Django to be able to communicate with the PostgreSQL database, we need to install Psycopg, the PostgreSQL adapter for Python.

(lendcafe)anil@django:/opt/lendcafe$ pip install psycopg2
Downloading/unpacking psycopg2
  Downloading psycopg2-2.5.2.tar.gz (685Kb): 685Kb downloaded
  Running setup.py egg_info for package psycopg2

Installing collected packages: psycopg2
...
Successfully installed psycopg2
Cleaning up...
(lendcafe)anil@django:/opt/lendcafe$ 

Edit the settings.py file, located in the subdirectory that has the same name as your project.

(lendcafe)anil@django:/opt/lendcafe$ nano lcproject/lcproject/settings.py

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'lendcafe',
        'USER': 'anil',
        'PASSWORD': 'myPassword',
        'HOST': 'localhost',
        'PORT': '',
    }
}
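
Hardcoding the password in settings.py is fine for a throwaway development VM, but the same dictionary can be built from an environment variable so that the secret stays out of the file. The following is only a sketch of that idea: database_settings is a hypothetical helper and LENDCAFE_DB_PASSWORD is a variable name I made up for illustration. Everything else mirrors the values above.

```python
import os


def database_settings(environ=os.environ):
    """Build a Django DATABASES dict, reading the password from the environment."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql_psycopg2",
            "NAME": "lendcafe",
            "USER": "anil",
            # LENDCAFE_DB_PASSWORD is a hypothetical variable name.
            "PASSWORD": environ.get("LENDCAFE_DB_PASSWORD", ""),
            "HOST": "localhost",
            "PORT": "",  # an empty string selects the default port
        }
    }


DATABASES = database_settings()
```

With this in place, the variable would need to be exported in the shell (or in whatever process manager starts Gunicorn) before Django runs.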

Run the following command to add Django specific database tables and configuration to database. I received the following error message when I tried to run the command.

(lendcafe)anil@django:/opt/lendcafe$ python lcproject/manage.py syncdb
Traceback (most recent call last):
  File "lcproject/manage.py", line 8, in <module>
    from django.core.management import execute_from_command_line
ImportError: No module named django.core.management

The StackOverflow discussion django import error - No module named core.management provides potential solutions to this error.

My issue turned out to be that I had originally installed Django as the root user, while I was trying to run the above command as a different user. When I ran the above command with sudo, I encountered a different error.

(lendcafe)anil@django:/opt/lendcafe/lcproject$ sudo python manage.py syncdb
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
...
  File "/usr/local/lib/python2.7/dist-packages/django/db/backends/postgresql_psycopg2/base.py", line 25, in <module>
    raise ImproperlyConfigured("Error loading psycopg2 module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading psycopg2 module: No module named psycopg2
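
Both tracebacks come down to the same question: which interpreter is running, and where does it find each package? The sudo run used the system Python (note /usr/local/lib/python2.7/dist-packages in the traceback), where Django was installed but psycopg2 was not. A quick diagnostic along these lines can make that visible. This is a sketch assuming Python 3.4 or later for importlib.util.find_spec (the VM in this post ran Python 2.7), and module_origin and report are hypothetical helper names.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def report(names):
    """Print where each module resolves and whether that is under sys.prefix."""
    for name in names:
        origin = module_origin(name)
        if origin is None:
            print(name, "-> no importable file found for this interpreter")
        else:
            where = "inside" if origin.startswith(sys.prefix) else "outside"
            print(name, "->", origin, "(" + where + " the active prefix)")


report(["django", "psycopg2"])
```

Running it once inside the activated virtualenv and once under sudo should show the two different sets of install locations.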

At this point, I decided to reinstall Django, this time without sudo.

(lendcafe)anil@django:/$ pip install django
Downloading/unpacking django
  Downloading Django-1.6.2.tar.gz (6.6Mb): 6.6Mb downloaded
  Running setup.py egg_info for package django

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
Installing collected packages: django
  Running setup.py install for django
    changing mode of build/scripts-2.7/django-admin.py from 664 to 775

    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
    changing mode of /opt/lendcafe/bin/django-admin.py to 775
Successfully installed django
Cleaning up...

Once I reinstalled Django, I was able to successfully add the Django-specific database tables.

(lendcafe)anil@django:/opt/lendcafe/lcproject$ python manage.py syncdb
Creating tables ...
Creating table django_admin_log
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_groups
Creating table auth_user_user_permissions
Creating table auth_user
Creating table django_content_type
Creating table django_session

You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'anil'): 
Email address: xyz@example.com
Password: 
Password (again): 
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)
(lendcafe)anil@django:/opt/lendcafe/lcproject$ 

Gunicorn

Create a gunicorn_config.py file in the /opt/lendcafe directory and add the following entries to the file.

(lendcafe)anil@django:/opt/lendcafe$ nano gunicorn_config.py

command = '/opt/lendcafe/bin/gunicorn'
pythonpath = '/opt/lendcafe/lcproject'
bind = '127.0.0.1:8001'
workers = 5
user = 'anil'
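
Because gunicorn_config.py is an ordinary Python file, the worker count does not have to be a fixed number. The Gunicorn documentation suggests (2 x number of cores) + 1 as a starting point, which can be computed when the config is loaded. This is a sketch of that rule of thumb, not what I used on this VM; recommended_workers is a hypothetical helper name.

```python
import multiprocessing


def recommended_workers(cores):
    """Rule-of-thumb starting point from the Gunicorn docs: (2 x cores) + 1."""
    return 2 * cores + 1


# Could replace the fixed `workers = 5` line in gunicorn_config.py.
workers = recommended_workers(multiprocessing.cpu_count())
```

For a VM with two virtual cores this gives the same value of 5 used above.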

Use the following command to run the server.

(lendcafe)anil@django:/opt/lendcafe$ /opt/lendcafe/bin/gunicorn -c /opt/lendcafe/gunicorn_config.py lcproject.wsgi
2014-02-21 15:41:13 [2892] [INFO] Starting gunicorn 18.0
2014-02-21 15:41:13 [2892] [INFO] Listening at: http://127.0.0.1:8001 (2892)
2014-02-21 15:41:13 [2892] [INFO] Using worker: sync
2014-02-21 15:41:13 [2897] [INFO] Booting worker with pid: 2897
2014-02-21 15:41:13 [2898] [INFO] Booting worker with pid: 2898
2014-02-21 15:41:13 [2899] [INFO] Booting worker with pid: 2899
2014-02-21 15:41:13 [2900] [INFO] Booting worker with pid: 2900
2014-02-21 15:41:13 [2901] [INFO] Booting worker with pid: 2901

Background the process by pressing Ctrl+Z and then typing bg followed by Enter. Tools such as supervisord and screen can be used to manage Gunicorn start/restart. Refer to How to Install and Manage Supervisor on Ubuntu and Debian VPS for more information.

Enter the IP address and port specified in gunicorn_config.py file in a browser's address bar. If you see the message It worked! Congratulations on your first Django-powered page., you are good to go.


Welcome to Django GUI
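
On a server VM without a desktop browser, the same check can be made from a script run inside the VM. The sketch below assumes Python 3 and uses only the standard library; fetch_status is a hypothetical helper name. It bypasses any configured HTTP proxy because the target is the loopback address that Gunicorn is bound to.

```python
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request to url."""
    # An empty ProxyHandler ignores proxy settings from the environment,
    # which is appropriate when the target is a local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as response:
        return response.status
```

With Gunicorn running as configured above, fetch_status("http://127.0.0.1:8001/") should return 200. Error responses and connection failures raise an exception instead of returning a code.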

NGINX

Django settings for the project are stored in the settings.py file. As NGINX will be used to serve static files, set STATIC_ROOT in the /opt/lendcafe/lcproject/lcproject/settings.py file to the directory NGINX will serve from, and keep STATIC_URL as the URL prefix.

(lendcafe)anil@django:/opt/lendcafe/lcproject/lcproject$ nano settings.py

# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.6/howto/static-files/

STATIC_URL = '/static/'
STATIC_ROOT = '/opt/lendcafe/static/'

Create a subdirectory static in /opt/lendcafe/ directory.

(lendcafe)anil@django:/opt/lendcafe$ mkdir static
(lendcafe)anil@django:/opt/lendcafe$ ls -la
total 36
drwxr-xr-x 8 anil anil 4096 Feb 22 15:26 .
drwxr-xr-x 3 root root 4096 Feb 20 10:37 ..
drwxr-xr-x 2 anil anil 4096 Feb 21 15:28 bin
-rw-rw-r-- 1 anil anil  133 Feb 21 15:41 gunicorn_config.py
drwxr-xr-x 2 anil anil 4096 Feb 20 10:37 include
drwxrwxr-x 3 anil anil 4096 Feb 21 11:52 lcproject
drwxr-xr-x 3 anil anil 4096 Feb 20 10:37 lib
drwxr-xr-x 2 anil anil 4096 Feb 20 10:37 local
drwxrwxr-x 2 anil anil 4096 Feb 22 15:26 static
(lendcafe)anil@django:/opt/lendcafe$ 

NGINX is a reverse proxy and HTTP server. There are two kinds of blocks in the configuration file that we will primarily be working with: server blocks and location blocks. A server block is very similar to a virtual host, and a location block corresponds to a URI. There is much more information available in the NGINX Beginner's Guide.

Create a new NGINX config file lendcafe in /etc/nginx/sites-available/ directory and enter the following configuration information.

(lendcafe)anil@django:/opt/lendcafe$ sudo nano /etc/nginx/sites-available/lendcafe

server {
    server_name localhost;

    location /static/ {
        alias /opt/lendcafe/static/;
    }

    location / {
        proxy_pass http://127.0.0.1:8001;
        proxy_set_header X-Forwarded-Host $server_name;
        proxy_set_header X-Real-IP $remote_addr;
        add_header P3P 'CP="ALL DSP COR PSAa PSDa OUR NOR ONL UNI COM NAV"';
    }
}
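
To make the two location blocks concrete: a request whose URI starts with /static/ is served straight from disk, with the matched prefix swapped for the alias path, and everything else is proxied to Gunicorn. The sketch below mimics only that prefix substitution in plain Python. It is a simplified illustration (no path normalization or other NGINX matching rules), and resolve_alias is a hypothetical name.

```python
def resolve_alias(uri, location="/static/", alias="/opt/lendcafe/static/"):
    """Mimic how an NGINX `alias` maps a request URI to a file path.

    Returns None when the URI does not fall under the location prefix,
    in which case the request would go to the `location /` block instead.
    """
    if not uri.startswith(location):
        return None
    # The matched location prefix is replaced by the alias path.
    return alias + uri[len(location):]


print(resolve_alias("/static/css/site.css"))  # /opt/lendcafe/static/css/site.css
print(resolve_alias("/admin/"))               # None
```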

Create a symbolic link in /etc/nginx/sites-enabled directory to the NGINX configuration file lendcafe.

(lendcafe)anil@django:/etc/nginx/sites-enabled$ sudo ln -s /etc/nginx/sites-available/lendcafe

Delete the symbolic link default.

(lendcafe)anil@django:/etc/nginx/sites-enabled$ sudo rm default

Restart the NGINX service for the changes to take effect.

(lendcafe)anil@django:/etc/nginx/sites-enabled$ sudo service nginx restart
Restarting nginx: nginx.
(lendcafe)anil@django:/etc/nginx/sites-enabled$ 

The VM is now set up for Python/Django web development.

Tuesday, February 11, 2014

OpenStack: Virtual Image Instances using Horizon Dashboard

Install Addendum


Enable VT in BIOS

An addendum to the install steps in my previous post, OpenStack: Quick Install using DevStack, is required to avoid a surprise that I encountered after the install. Please make sure the BIOS is at the latest version available from the system manufacturer and that Intel's Virtualization Technology (VT) is enabled in the BIOS.

anil@OSCloud:~$ sudo apt-get install cpu-checker
anil@OSCloud:~$ sudo kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

If CPU doesn't support VT, the output will show CPU does not support KVM extensions.
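
A related check is to look for the hardware virtualization flags in /proc/cpuinfo: vmx for Intel VT-x and svm for AMD-V. The sketch below parses that file on a Linux host; virtualization_flags is a hypothetical helper name. It only shows what the CPU advertises, so kvm-ok remains the more complete check.

```python
import os


def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            # vmx = Intel VT-x, svm = AMD-V
            found.update(flag for flag in flags if flag in ("vmx", "svm"))
    return found


# Only attempt the read where /proc/cpuinfo exists (Linux).
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as handle:
        print(virtualization_flags(handle.read()) or "no vmx/svm flag found")
```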

Horizon Dashboard

The OpenStack Horizon Dashboard is implemented as a Python/Django web application that provides admin and user interfaces to OpenStack services.


Log in

In a web browser, type the IP address of the dashboard. On the Log In page, enter the User Name and Password and click Sign In. When signing in as Admin, the home page shows the Admin panel - System Panel - Overview.

Horizon Dashboard Admin Home Page


Existing Virtual Machine Images

Clicking the Images category in the Admin - System panel on the left displays a list of available images. In the default installation, a CirrOS x86_64 image is made available in AMI/ARI/AKI format.

CirrOS images are tiny cloud guest images containing a minimal Linux distribution; they can also be downloaded from Launchpad. AMI/ARI/AKI is the image format supported by Amazon EC2. AMI (Amazon Machine Image) is a raw virtual machine image. AKI (Amazon Kernel Image) is a kernel file (vmlinuz) that is loaded first to boot the image. ARI (Amazon Ramdisk Image) is a ramdisk file (initrd) mounted at boot time.

Horizon Dashboard Admin Images Page


Launch Instances

Clicking on the Project tab in the left panel shows an overview of the current project.

Horizon Dashboard Project Home Page

To launch an instance from an image, click the Images and Snapshot category in the Project - Manage Compute panel on the left.

Horizon Dashboard Project Images & Snapshot Page

Select an image and click Launch. A Launch Instance modal pop-up appears. Enter a name in the Instance Name field in the Details tab.

Horizon Launch Instance Details Popup

In the Access & Security tab, enter a passphrase in the Admin Pass and Confirm Admin Pass fields.

Horizon Launch Instance Access & Security Popup

Upon clicking Launch, the Horizon dashboard switches to the Project - Manage Compute - Instances page and shows the running instances.

Horizon Project Instances

Clicking on Instance Name hyperlink shows the Instance Details for that specific instance with three tabs for Overview, Log and Console.

Horizon Project Instance Console

Though the Project - Manage Compute - Instances page shows the instance as Active and Running, the console for the instance displays an error message.

This kernel requires an x86-64 CPU, but only detected an i686 CPU.
Unable to boot - please use a kernel appropriate for your CPU.


Error Troubleshooting

A little bit of googling suggested checking whether the 64-bit PC (amd64, x86_64) or the 32-bit PC (x86) version of the host operating system is installed. Sure enough, the Ubuntu version installed on the OSCloud host is x86, not x86-64, so I can't use x86-64 instance images on the OSCloud host.

anil@OSCloud:~$ uname -a
Linux OSCloud 3.2.0-58-generic-pae #88-Ubuntu SMP Tue Dec 3 18:00:02 UTC 2013 i686 i686 i386 GNU/Linux
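
The same check can be scripted, which is handy when verifying several hosts before picking images. This is a minimal sketch using platform.machine(), which reports the same machine field that uname shows. The set of 64-bit names is my own short list and an assumption, and is_64bit_machine is a hypothetical helper name.

```python
import platform

# Machine names treated as 64-bit; a short, non-exhaustive list.
SIXTY_FOUR_BIT = {"x86_64", "amd64", "aarch64", "arm64"}


def is_64bit_machine(machine=None):
    """Return True if the machine name indicates a 64-bit OS/kernel."""
    machine = (machine or platform.machine()).lower()
    return machine in SIXTY_FOUR_BIT


print(platform.machine(), "->", "64-bit" if is_64bit_machine() else "not 64-bit")
```

On the OSCloud host above, the machine name is i686, so the function would return False.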

After terminating the newly created instance test1 and deleting all x86_64 images, the next step was to either find or build x86 images and start a new x86 instance.

Prebuilt Virtual Machine Images

As the OSCloud host is using the QEMU hypervisor, it made sense to look for x86 images in qcow2 (QEMU copy-on-write) format. On the CirrOS download page, I found a bootable qcow disk image for i386 and decided to try it out.
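
Before uploading a downloaded file, it is easy to confirm that it really is a QCOW image by reading its header: QCOW files start with the four magic bytes "QFI" followed by 0xfb, and the next four bytes hold the format version as a big-endian integer. The sketch below reads only those eight bytes. The function name qcow_version is hypothetical, and the file name in the usage note is a placeholder.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"


def qcow_version(path):
    """Return the QCOW format version of the image at path, or None if not QCOW."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    # Bytes 4-7 hold the version as a big-endian unsigned 32-bit integer.
    return struct.unpack(">I", header[4:8])[0]
```

Calling qcow_version("downloaded-image.img") returns 2 or 3 for a qcow2 image and None for anything that is not QCOW at all. The qemu-img info command reports the same information in more detail.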

Create Images

To create images, on the Admin tab, select Images and then click the Create Image button in the right pane. On the Create An Image page, enter a Name for the image and select the Image Source, Image Location, and Format. Select the Public checkbox to make this image available to everyone. Then click Create Image. The image will be queued for creation.

Horizon Admin Create An Image

Once images are created, they will be available for launching instances in projects by following the steps listed above in the Launch Instances section.

Horizon Admin Images

Horizon Project Launch Instance

In the next blog post, I will start to dig deeper into high-level solution design using OpenStack. Your feedback and comments are welcome.