Share the Knowledge
  • Converting YouTube videos to MP3

    Posted on July 4th, 2014 webmaster No comments         

    I was looking for a new Django project to start a few weeks back.  I wanted it to be something I could use myself, and something that involves running tasks in the background, as I wanted to play around with Celery.  I usually use YouTube nowadays to listen to music while I'm working, so I figured it might be a good idea to write a web app where a user can enter a video's URL and it will extract the audio and convert it to MP3 format.  This seemed like a perfect project to do next: the conversion process is well suited to running in the background, and I could actually use the app once it's finished.  I don't have a data plan, so it would be really nice to download some stuff from YouTube that I could listen to on my commute to work or when taking my daily walks.

    Luckily, someone already built the harder part of downloading the videos and doing the conversion.  There’s this open source project called youtube-dl that’s actually written in Python, so even better.  All I have to do now is create the frontend and the Celery tasks to run the processes in the background.
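    youtube-dl also exposes a Python API that a Celery task could drive directly.  As a rough sketch, the options for MP3 extraction might look like this (the option names are youtube-dl's, but the values here are illustrative, not this app's actual configuration):

```python
# Illustrative youtube-dl options for extracting audio as MP3.
# Requires ffmpeg (or avconv) on the system for the post-processing step.
ydl_opts = {
    'format': 'bestaudio/best',                # grab the best audio stream
    'outtmpl': 'downloads/%(title)s.%(ext)s',  # output filename template
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',           # convert with ffmpeg
        'preferredcodec': 'mp3',
        'preferredquality': '192',             # bitrate in kbps
    }],
}

# Inside a Celery task, the download itself would look roughly like:
#   with youtube_dl.YoutubeDL(ydl_opts) as ydl:
#       ydl.download([url])
```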

    For server hosting, I used DigitalOcean as I find them to be a great value for the money: $5 a month for a VPS with 1 vCPU, 512MB of RAM, and 20GB of disk space.  I don't really need more than 20GB of disk space as I'm not planning on storing every download; I can just run a scheduled job (Celery has a module for this called 'celerybeat', which is basically cron) to clean up the files.
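    The cleanup job itself can stay simple.  As a hypothetical sketch (the function and argument names are mine, not from the app), a periodic task scheduled with celerybeat could just call something like this:

```python
import os
import time

def cleanup_old_files(directory, max_age_hours=24):
    """Delete converted MP3s older than max_age_hours; return their names."""
    cutoff = time.time() - max_age_hours * 3600
    removed = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if name.endswith('.mp3') and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```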

    I used an existing Bootstrap template that I used for another project, which is very simple but does the job.  The final product ended up looking like this:


    Because of the simple UI, the app works great on tablets and phones as well.  I currently set the limit for the maximum video length to 3 hours, which should be more than enough for most videos people would want to convert.
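    The length check itself is trivial: youtube-dl reports a video's duration in seconds in its info dict, so the limit boils down to something like this sketch (the names here are illustrative):

```python
# Hypothetical cap matching the 3-hour limit mentioned above.
MAX_DURATION_SECONDS = 3 * 60 * 60

def within_limit(duration_seconds):
    """Return True if the video is short enough to convert."""
    return duration_seconds <= MAX_DURATION_SECONDS
```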

    Check it out at:

  • Ansible Playbook for a Django Stack (Nginx, Gunicorn, PostgreSQL, Memcached, Virtualenv, Supervisor)

    Posted on April 20th, 2014 webmaster No comments         

    I decided to create a separate GitHub project for the Ansible playbook I’m currently using to fully provision a production server for my open-source app, GlucoseTracker, so it can be reused by people who are using the same stack.

    You can download the playbook here:

    The playbook can fully provision an Ubuntu 12.04 LTS server (will test 14.04 soon) from the base image with the following applications, which are quite popular in the Django community:

    • Nginx
    • Gunicorn
    • PostgreSQL
    • Memcached
    • Virtualenv
    • Supervisor

    I used this awesome guide to set up my server initially (which took like half a day) before automating the entire process with Ansible.  If I need to move to a new server or cloud provider, I can pretty much rebuild a fully-configured server in about 5 minutes with one command.  Pretty neat.

    Note: I've also run this playbook successfully on Amazon EC2, Rackspace, and DigitalOcean virtual private servers.


    For those who are in a hurry, simply install Ansible, Vagrant, and VirtualBox (if you don’t have them already), clone the project from GitHub, and type this in from the project directory:

    vagrant up

    Wait a few minutes for Ansible to do its magic, then visit the VM's address in your browser when it's finished.  Congrats, you just deployed a fully configured Django app!

    The Juicy Details

    Below are some things you should know before using this playbook for your projects.

    Project Structure

    I have my Django project structure set up this way:

    glucose-tracker\ (project directory)
        —-> glucosetracker\ (application directory)
            —-> settings\
        —-> requirements.txt file, scripts, other files and directories I don't consider part of the application

    If you have the same project structure that I have, then all you really have to change is the env_vars/base file to get started, where you can set the Git repo location, project name, and the application name which are used throughout the playbook.
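    As a rough sketch, env_vars/base is just an Ansible vars file along these lines (the variable names below are illustrative; check the actual file in the repo for the real keys):

```yaml
# Illustrative only; see env_vars/base in the repo for the real variable names.
git_repo:           # your application's Git repository URL
project_name: glucose-tracker
application_name: glucosetracker
```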

    If you don’t have the same project structure, you will need to change the group_vars/webservers file as well (and possibly the environment specific vars file in env_vars/ if you don’t split up your settings file), where you can set the path settings to match your project structure.

    Environment Variables

    I like to separate my environment-specific settings from the main code repo for security reasons and for easier management.  For example, in my Django settings file, I set the EMAIL_HOST_PASSWORD setting to something like:

    EMAIL_HOST_PASSWORD = os.environ['EMAIL_HOST_PASSWORD']
    This way, I won’t have to leave the password in the code and if I need to change the email password I can do so quickly by changing the environment variable setting on the server instead of modifying the code and re-deploying it.

    The way I have this set up is with a postactivate script (see roles/web/templates/) that creates the environment variables.  It runs after activating the virtualenv, so those settings are applied only to that virtualenv.

    Now, because I like having all my configurations in my Ansible playbook repo, I keep these values in a vars file and encrypt them with Ansible Vault (see my previous post about this for more details).

    Applying Roles

    The playbooks in the repo apply all the roles.  If you only want certain roles applied to your server, simply remove the ones you don’t need from the roles: section.

    For example, if you don't use Memcached, your roles: section will look something like this:

        - base
        - db
        - web

    Django Management Commands

    In env_vars/, you will see the following settings in the environment-specific vars files:

    run_django_syncdb: yes
    run_django_south_migration: yes
    run_django_collectstatic: yes

    If you don’t want to run some of these commands, simply set the value to no.
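    Presumably each flag gates a task with a when: condition.  As an illustrative sketch using Ansible's django_manage module (the playbook's actual task names and arguments may differ):

```yaml
# Sketch only; not the playbook's exact task.
- name: Run Django syncdb
  django_manage: command=syncdb
                 app_path={{ application_path }}
                 virtualenv={{ virtualenv_path }}
  when: run_django_syncdb
```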

    OpenSSL ‘Heartbleed’ Patch

    Don’t worry, the playbook already takes care of this for you.  The first task in the playbook is to do an apt-get update and ensure that openssl and libssl are the latest version. ;)

    I think that pretty much covers most of the questions you might have if you decide to use this Ansible playbook for your Django projects. I might add a few more roles here in the next few weeks, such as Celery, RabbitMQ, and Solr, as we use them at work and we’re currently in the process of automating our infrastructure.

    If you have any questions or suggestions, please feel free to leave a comment below.

    Some Useful Links:

  • How to deploy encrypted copies of your SSL keys and other files with Ansible and OpenSSL

    Posted on April 5th, 2014 webmaster 3 comments         

    I’ve been working on fully automating server provisioning and deployment of my Django app, GlucoseTracker, the last couple of weeks with Ansible. Since I made this project open-source, I needed to make sure that passwords, secret keys, and other sensitive information are encrypted when I push my code to my repository.

    Of course, I have the option to not commit them to the repo, but I want to be able to build my entire infrastructure from code and maintain all configuration in one place, the git repo.

    Fortunately, Ansible has a command line tool called Ansible Vault (comes with the core package) that will allow you to encrypt your configuration files and decrypt them during deployment by passing in the password or password file in the command.  This is mainly useful for encrypting your environment variables files that contain the passwords/keys for your application.

    For example, in my Django settings, instead of assigning the values directly in the settings file, I do something like this:

    SECRET_KEY = os.environ['DJANGO_SECRET_KEY']

    Django would then read the value from the server’s environment variables.  In my case, since I use virtualenv, I have a postactivate script that sets the environment variables for that virtualenv.
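    The postactivate script is just a shell file of exports.  An illustrative sketch (the values are placeholders, and the real script lives under the virtualenv's bin/ directory):

```shell
#!/bin/bash
# Illustrative postactivate sketch; placeholder values only.
export DJANGO_SECRET_KEY='change-me'
export DJANGO_SETTINGS_MODULE='settings.production'
```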

    Since I want Ansible to fully automate my server configuration and store all the information needed to do so in a git repo, I have to encrypt the variables that my app uses in production.  For example, I have a file called production in the env_vars folder of my repo that the playbook will use.  I encrypt this file with a password using Ansible Vault, and when running the playbook I decrypt it by passing in the password as a command argument.  If you use a CI tool like Jenkins, you can set this password in Jenkins (perhaps as an environment variable) so you won't have to type it in manually.

    If I need to make a change in the vars file, I can simply just type in:

    ansible-vault edit env_vars/production

    This will prompt me for the password and open the decrypted file in vim, where I can make my changes and save it back in its encrypted form.  This is a nice option when you just need to make small changes, as you won't forget to re-encrypt the file after decrypting it.

    Here’s what an Ansible Vault encrypted vars file will look like:
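    Roughly speaking, it is a single header line followed by hex-encoded ciphertext, something like this (the body below is made up for illustration, and the version and cipher in the header can vary):

```
$ANSIBLE_VAULT;1.1;AES256
62313365396662343061393464336163383764373764613633653634306231386433626436623361
6134333665353966363534333632666535333761666131630a663537646436643839616531643561
```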


    What about files that get copied to the server such as private keys for SSL certificates?

    This is where I had to do something extra as the Ansible copy module will copy these files as they’re stored in the repository (i.e. the encrypted version).  To get around this, I simply used OpenSSL to encrypt these files using symmetric encryption and set the password to decrypt it in my vars file (which Ansible Vault will encrypt).

    To encrypt a file with OpenSSL using AES 256 encryption:

    openssl aes-256-cbc -salt -a -e -in ssl_signed/unencrypted.key -out ssl_signed/encrypted.key -k MysupasecuresecretPasswordZ.x!!

    To decrypt an AES 256 encrypted file with OpenSSL:

    openssl aes-256-cbc -salt -a -d -in ssl_signed/encrypted.key -out ssl_signed/unencrypted.key -k MysupasecuresecretPasswordZ.x!!

    Example vars file:

    # Nginx settings.
    ssl_src_dir: ssl_signed
    ssl_dest_dir: /etc/ssl
    ssl_key_password: MysupasecuresecretPasswordZ.x!!

    I have a task in my playbook that copies my SSL cert and key to the remote server and runs the OpenSSL command to decrypt the key (using the password from the ssl_key_password variable in my vars file):

    - name: Copy the SSL cert and key to the remote server
      copy: src={{ ssl_src_dir }}/ dest={{ ssl_dest_dir }}
    - name: Decrypt the SSL key
      command: openssl aes-256-cbc -salt -a -d -in {{ ssl_dest_dir }}/nginx.key
               -out {{ ssl_dest_dir }}/decrypted.key -k {{ ssl_key_password }}
               creates={{ ssl_dest_dir }}/decrypted.key
    - name: Rename the decrypted SSL key
      command: mv {{ ssl_dest_dir }}/decrypted.key {{ ssl_dest_dir }}/nginx.key
               removes={{ ssl_dest_dir }}/decrypted.key

    Now let’s run the production playbook:

    ansible-playbook -i inventory/production --private-key=/aws-keys/ec2-glucosetracker.pem --vault-password-file=~/ansible/decryption_password -vvvv production.yml

    This is just one example and this same simple concept can be applied to different scenarios.  Just to summarize the steps:

    1. Encrypt your files with OpenSSL using symmetric encryption.
    2. Assign the decryption password to a variable in your Ansible vars file.
    3. Encrypt your vars file using Ansible Vault.
    4. Create a task in your playbook to decrypt the encrypted files using OpenSSL and the password in the encrypted vars file.
    5. Run your Ansible playbook, passing in the Ansible Vault password in the command or specifying the file where the password is stored.

    View my entire playbook here:

  • Django Tip: How to configure Gunicorn to auto-reload your code during development

    Posted on March 30th, 2014 webmaster No comments         

    I just finished fully automating my entire server stack for my Django app with Ansible and Vagrant (using VirtualBox).  One of the reasons I did this is to make my development environment as close to production as possible, to hopefully eliminate any surprises when deploying.  It also lets me set up a development environment very quickly, as I won't have to deal with manual installation and configuration of different packages.  In a team environment, the benefits multiply.

    This is basically my process:

    1. Type in ‘vagrant up’ to create or start the VirtualBox virtual machine.

    2. I have my virtual machine configured via Vagrant to share my local code (which is located in my Dropbox folder) with the virtual machine.

    3. I make a change to my code, then I open my web browser and enter my virtual machine's static IP.  The browser shows my changes.

    What I want Gunicorn to do is similar to what the Django runserver does: automatically reload the application server when the code changes.

    There are different ways to approach this, such as using a package called watchdog to watch for file changes and then restart Gunicorn.  But it turns out there's an even simpler way to do this with Gunicorn: set the max_requests setting to 1 (see the full list of settings).  When calling Gunicorn, simply add this option (note that it starts with 2 dashes):

    --max-requests 1

    What this basically does is tell Gunicorn to restart the worker process after every request, which reloads your code.  It won't know whether your code changed or not; it will always reload it.  For production this is probably not a good idea, but during development it's a nice, simple trick, and you won't really notice a performance difference since you'd be the only user.
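    If you prefer a config file over command-line flags, Gunicorn can also read its settings from a Python file passed with -c.  A minimal sketch (the filename and worker count here are arbitrary):

```python
#  (hypothetical name; run with: gunicorn -c myapp.wsgi)
# Restart each worker after every request so code changes get picked up.
# Development only; do not use this in production.
max_requests = 1
workers = 1
```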

    Here’s the shell script that I use to start my Gunicorn process (note that I use Nginx to communicate with Gunicorn via a socket file, also note that I have variables set here which Ansible replaces with the actual values):

    NAME="{{ application_name }}"
    DJANGODIR={{ application_path }}
    SOCKFILE={{ virtualenv_path }}/run/gunicorn.sock
    USER={{ gunicorn_user }}
    GROUP={{ gunicorn_group }}
    NUM_WORKERS={{ gunicorn_num_workers }}
    # Set this to 0 for unlimited requests. During development, you might want
    # to set this to 1 to automatically restart the process on each request
    # (i.e. your code will be reloaded on every request).
    MAX_REQUESTS={{ gunicorn_max_requests }}

    echo "Starting $NAME as `whoami`"

    # Activate the virtual environment.
    cd $DJANGODIR
    source ../../bin/activate
    source ../../bin/postactivate

    # Create the run directory if it doesn't exist.
    RUNDIR=$(dirname $SOCKFILE)
    test -d $RUNDIR || mkdir -p $RUNDIR

    # Programs meant to be run under supervisor should not daemonize themselves
    # (do not use --daemon).
    exec python run_gunicorn \
        --settings={{ django_settings_file }} \
        --name $NAME \
        --workers $NUM_WORKERS \
        --max-requests $MAX_REQUESTS \
        --user=$USER --group=$GROUP \
        --bind=unix:$SOCKFILE \
        --log-level=debug
  • Deploying your Django app with Fabric

    Posted on January 25th, 2014 webmaster No comments         

    I've been making quite a few improvements and changes to my Django app, GlucoseTracker, lately, and the small amount of time I spent creating a deployment script with Fabric has already paid off.

    Fabric is basically a library written in Python that lets you run commands on remote servers (works locally as well) via SSH. It’s very easy to use and can save you a lot of time. It eliminates the need to connect to a remote server manually, and if you have a lot of servers to update, then the time savings really add up.

    Since I only have one server for my production environment, my setup and deployment process are very simple.  I use GitHub to store my code, a virtualenv environment where my code is deployed to, Gunicorn for the WSGI server managed by Supervisor, PostgreSQL for the database, and Nginx for the web server.

    My deployment process goes something like this:

    1. Run unit tests.
    2. Pull latest code from the master branch hosted on GitHub.
    3. Activate virtualenv.
    4. Run pip install against the requirements file (in case a new library was added or updated).
    5. Run South migrations for all apps (in case there were changes in the database/table schemas).
    6. Restart Gunicorn with Supervisor.

    Here's what my looks like:

    from fabric.api import local, env, cd, sudo

    env.hosts = ['']

    # The user account that owns the application files and folders.
    owner = 'glucosetracker'
    app_name = 'glucosetracker'
    app_directory = '/webapps/glucosetracker/glucose-tracker'
    settings_file = 'settings.production'


    def run_tests():
        local('coverage run test -v 2 --settings=settings.test')


    def deploy():
        """
        Deploy the app to the remote host.

        1. Change to the app's directory.
        2. Pull changes from the master branch in git.
        3. Activate the virtualenv.
        4. Run pip install using the requirements.txt file.
        5. Run South migrations.
        6. Restart the Gunicorn WSGI server using supervisor.
        """
        with cd(app_directory):
            sudo('git pull', user=owner)

            venv_command = 'source ../bin/activate'
            pip_command = 'pip install -r requirements.txt'
            sudo('%s && %s' % (venv_command, pip_command), user=owner)

            south_command = 'python glucosetracker/ migrate --all ' \
                            '--settings=%s' % settings_file
            sudo('%s && %s' % (venv_command, south_command), user=owner)

            sudo('supervisorctl restart glucosetracker')

    To run this script, you first need to install the Fabric library:

    pip install fabric

    Then call the run_tests method by typing:

    fab run_tests

    Deploy with:

    fab deploy

    Make sure to run the fab command in the directory where the file is located.

    For the run_tests() method, you’ll notice that I use the function local(). Since I don’t have a staging environment I just run my tests locally. I then deploy directly to my production server.

    Also note that the SSH session is not persistent, which is why you can see in my script that I combine the virtualenv activation command with the commands dependent on it being active.  I also run my app using a user account named glucosetracker which has limited access to the server to minimize damage in case someone figures out a way to run malicious code through my app.

    That’s pretty much it.  This is just a very simple example and you can do a lot more with it.  It takes very little time to get started, so even for small projects it’s definitely worth checking out.  It’s really nice to be able to make even just tiny changes to your app and have it deployed in seconds by running one simple command.