Learning Python the Hard Way

Being new to programming, I was learning a lot of things at a very fast pace (at least it felt that way to me).

This post is a continuation of Learning Scripting the Hard Way. I’ll talk about how I learnt Python and wrote multiple applications in it, all in THREE DAYS! (Yes, you read that right!)

First Day: Start Learning Python + Build Unique Email Filter :)

My mentor (as part of the ACA Project) told me to start learning Python on Feb 17 at about 4 AM. The interesting part is that 3 minutes later I was given the first task, due the night of Feb 17 itself.

The first task was as follows:

Create a script filter.py that takes two command line arguments. The script reads the first argument file, extracts all unique email addresses and outputs them to the second file. You will be marked, more marks for shorter code. For this question, shorter is more important than cleaner code. Also, you are not allowed to use any package that is not preinstalled in python2.7.

Sleep was already long gone, so I spent the next few hours on Python documentation pages, online blogs, and some videos (although those were not that useful). It was an invigorating 5-6 hours, learning a new language and its intricacies, especially how “short” a piece of code we can write in Python :P

In the afternoon of the same day, I started writing code for my assignment, figured out the regex, and finally submitted the following code as my answer:

import sys,re
g,u=[],{}
r=r"[\w\+\._-]+@\w+[\w*\.]+\w+"
f=open(sys.argv[1],'r')
for l in f:
    g+=re.findall(r,l)
for i in g:
    u[i]=1
sys.stdout=open(sys.argv[2], 'w')
print ("\n".join(u.keys()))
sys.stdout.close()

For the above code, I was given full marks for correctness but half marks for length/perfection. The question said “shorter is more important than cleaner code”, and that is the part I messed up :(

But still, this was a fun start.
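In hindsight, the length could probably have been trimmed by reading the whole file at once and letting a set handle uniqueness. The sketch below only illustrates that idea. It is written for Python 3 rather than the Python 2.7 the task required, the function names unique_emails and filter_file are hypothetical, and the regex is copied unchanged from the submission above.

```python
import re

# Regex copied as-is from the submitted answer.
EMAIL = r"[\w\+\._-]+@\w+[\w*\.]+\w+"


def unique_emails(text):
    """Return the set of distinct email-like strings found in text."""
    return set(re.findall(EMAIL, text))


def filter_file(src_path, dst_path):
    """Write the unique addresses from src_path to dst_path, one per line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # Sorting is not required by the task; it just makes the output stable.
        dst.write("\n".join(sorted(unique_emails(src.read()))))
```

From a script, this would be called as filter_file(sys.argv[1], sys.argv[2]) after importing sys.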

Second Day: Grep (in Python)

On the night of the first day (Feb 17), after submitting the email filter, I was given the second task, due the very next day.

The task:

Create a version of grep in python. Default action would be that of -P (supports all perl regex (same as python “re”)). You have to implement -o, -r flag as well, which may be passed by the user. In case of behaviour you are not sure of, experiment by running grep and seeing its output.

  • Readability>Length
  • Some points for efficiency.
  • Extra bonus for colorised match like grep (when -o is not passed)

To keep things interesting, I’ll also add the marking scheme (out of 20) here:

  • General working - 7
  • Correct working of “-o”:
    • on single output per line - 3
    • on multiple per line - 3
  • Colorized output - 3
  • Recursive functionality - 4
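The two “-o” cases in the marking scheme (one match per line versus several) can be sketched roughly as follows. This is a simplified illustration and not the code I submitted; the function name grep_lines and its parameters are made up for the example, and it works on a list of lines rather than on files.

```python
import re


def grep_lines(pattern, lines, only_matching=False):
    """Return matching lines, or only the matched parts when only_matching is set."""
    regex = re.compile(pattern)
    output = []
    for line in lines:
        line = line.rstrip("\n")
        if only_matching:
            # Like grep -o: every non-empty match goes on its own output line,
            # so a line with several matches yields several results.
            output.extend(m.group(0) for m in regex.finditer(line) if m.group(0))
        elif regex.search(line):
            output.append(line)
    return output
```

For example, grep_lines(r"\d+", ["a1 b22", "none", "x3"], only_matching=True) yields one entry per match, while the default mode returns the two whole lines that contain a digit.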

I won’t bore you with the details of how much time I spent learning and coding all this up. One good thing was that, having worked so rigorously on scripting for the last month, I was very comfortable with grep and regex.

The final code I submitted (on the night of Feb 18) can be found on GitHub.

I wasn’t able to implement color, but I implemented 4 extra flags:

  • -h, print without filename headers
  • -i, ignore-case, case-insensitive search
  • -l, print only filenames with matches
  • -n, print line numbers, indexed from 1

I got 17 + 4 (bonus) marks: I lost the 3 marks for colorized output since color was not working, and got a bonus of 4 for the extra flags.

Something cool I discovered during this: the shebang. It was pointed out that I had missed putting the following line at the top of my Python script:
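For the record, one way the missing color could be approached is to wrap every match in ANSI escape codes. The sketch below is only an assumption about how it might be done, not something from my submission; the helper name colorize is hypothetical, and bold red is chosen to resemble what grep shows by default.

```python
import re

MATCH_START = "\033[01;31m"  # ANSI escape: bold red
RESET = "\033[0m"            # ANSI escape: back to normal text


def colorize(pattern, line):
    """Return line with every non-empty regex match wrapped in color codes."""
    def wrap(match):
        text = match.group(0)
        # Leave empty matches alone so no stray escape codes are emitted.
        return MATCH_START + text + RESET if text else text

    return re.sub(pattern, wrap, line)
```

Printing the result of colorize(r"b+", "abba") in a terminal that understands ANSI codes should show the middle "bb" highlighted.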

#!/usr/bin/env python

Interesting fact about the shebang: for interpreters, the recommended way is to write:

#!/usr/bin/env python

instead of

#!/usr/bin/python

Can you guess (or do you already know) why?

The answer (if you guessed it) is that you cannot always know the absolute path of the binary on the system of whoever runs the script. When you launch python through env, it searches the directories listed in the PATH variable and runs the first python it finds, so you don’t need to worry about the exact path of the binary (or even which version is installed there).
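The PATH lookup that env relies on can be imitated in a few lines. This is a rough sketch of the idea, not how env is actually implemented; the function name find_on_path is made up, and the standard library's shutil.which does a similar search for you.

```python
import os


def find_on_path(name, path):
    """Return the first executable called name in the PATH-style string path."""
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # The first directory that holds an executable file with this name wins.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Calling find_on_path("python", os.environ["PATH"]) should give the interpreter that the env form of the shebang would pick on that machine, or None if there is no such binary.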

Third Day: Passing the Parcel (via ZeroMQ)

On completing the second task, we moved on to the next task for the third day. I was told to learn ZeroMQ, a communication library.

The collaborative task (each person had to define their own script):

You have to create a script that plays “passing the parcel” with the other scripts i.e. script0 will send a message to script1, script1 will send a message to script2 and so on. The game to be played is “get to 1” (Collatz Conjecture game).

  • You start with a number.
  • Tell it to neighbour.
  • If even, divide by 2.
  • If odd, multiply by 3 and add 1.
  • Pass updated number to neighbour.
  • If you hit 1, just pass 1 and quit.
  • If you get 1, pass it and quit.

This way, all scripts quit. Whichever script is passed the argument start will start with an initial RANDOM number > 1000.

I started reading up on ZeroMQ; it seemed very simple (and well, it was :P). A short description based on a Google search: “ZeroMQ is an open-source asynchronous messaging library aimed at use in distributed or concurrent applications”.

How I would describe ZeroMQ: “It is a messaging queue for transferring data between 2 or more parties asynchronously, i.e. a producer can keep filling the queue without waiting on consumers and vice versa”. It is a very simple answer, but it explains the concept. To accomplish this, it supports multiple messaging patterns. I will just list them here; you can read about them in detail online :)
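The rules of the game can be simulated in a single process, without any messaging, to see how the number travels around the ring. This is only an illustrative sketch: the names collatz_step, random_start and play are hypothetical, the upper bound for the random start is an arbitrary choice, and the real task used separate scripts talking over ZeroMQ.

```python
import random


def collatz_step(n):
    """Halve an even number; turn an odd number into 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def random_start():
    """Pick a random starting number > 1000 (upper bound chosen arbitrarily)."""
    return random.randint(1001, 1_000_000)


def play(start, scripts=3):
    """Return every number passed around a ring of scripts, in order."""
    alive = [True] * scripts
    trace = [start]
    number = start
    holder = 1 % scripts  # script 0 tells the start number to its neighbour
    while alive[holder]:
        if number != 1:
            number = collatz_step(number)
        trace.append(number)  # the holder passes the number on
        if number == 1:
            alive[holder] = False  # hit 1 or got 1: pass it and quit
        holder = (holder + 1) % scripts
    return trace
```

Once some script hits 1, the 1 keeps travelling until it reaches a script that has already quit, so every script in the ring ends up quitting.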

  • Request-Reply (kind of like server-client with acks, think of something like a telephone call).
  • Publish-subscribe (kind of like how news subscription works, you can subscribe to a newspaper publishing company).
  • Push-Pull (kind of like producer-consumer model, a producer can evenly distribute data to a number of consumers).
  • Exclusive pair (2 sockets connected exclusively).
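To make one of these patterns concrete, here is a minimal Push-Pull sketch using the pyzmq package (assumed to be installed). The endpoint name inproc://parcel and the helper send_and_receive are made up for illustration, and both sockets live in one process only to keep the example self-contained. It is not the script I submitted.

```python
import zmq


def send_and_receive(number, endpoint="inproc://parcel"):
    """Push a number through a ZeroMQ socket and pull it out on the other side."""
    context = zmq.Context()

    # The receiving side binds first; inproc endpoints live inside one context.
    receiver = context.socket(zmq.PULL)
    receiver.bind(endpoint)

    sender = context.socket(zmq.PUSH)
    sender.connect(endpoint)

    sender.send_string(str(number))        # queue the message and return
    received = int(receiver.recv_string())  # block until the message arrives

    sender.close()
    receiver.close()
    context.term()
    return received
```

With scripts running as separate processes, the endpoint would instead be a TCP address such as tcp://127.0.0.1 followed by a port, with each script pulling from its own port and pushing to its neighbour's.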

Finally, with a clear understanding of ZeroMQ, I started reading up on its Python library, which was now simple to understand.

I wrote the final script; the code can be found on GitHub. It can be trivially tested by running multiple copies of it after updating the port values.

Passing the parcel (with my scripts)

Conclusion

After this, I spent the next week exploring Python on my own and discovered many interesting things, including a very important fact:

It is very easy to write bad code in Python!