Ditch List, use Dict in Python

So, one day I was using too much list & wondered why we can’t use dict all the time when it has O(1) time complexity.
Instead of pondering I tried to measure things myself & see. I ended up measuring access time in a dict & in a list. Below is the code.

import time

d = dict()
for i in range(10000000):
	d[i] = True
	
e = list()
for i in range(10000000):
	e.append(i)

t1 = time.time()
print("Found :)" if 45423 in d else "Not found :(")
t2 = time.time()
print("Time taken by dict : ", (t2 - t1))


t1 = time.time()
print("Found :)" if 45423 in e else "Not found :(")
t2 = time.time()
print("Time taken by list : ", (t2 - t1))

Now time to decide the winner and the results are as follows.

C:\Users\pankaj.kum\Documents\python-scripts>python dict_vs_list.py
Found :)
Time taken by dict :  0.0
Found :)
Time taken by list :  0.0009982585906982422

C:\Users\pankaj.kum\Documents\python-scripts>python dict_vs_list.py
Found :)
Time taken by dict :  0.0
Found :)
Time taken by list :  0.0010042190551757812

C:\Users\pankaj.kum\Documents\python-scripts>python dict_vs_list.py
Found :)
Time taken by dict :  0.0
Found :)
Time taken by list :  0.0009982585906982422

C:\Users\pankaj.kum\Documents\python-scripts>python dict_vs_list.py
Found :)
Time taken by dict :  0.0
Found :)
Time taken by list :  0.0010023117065429688

Dict takes very less time ~ 0 seconds.
So, dict or hashed map of unordered type should be convenient to use wherever you have large accessing over an array of elements. But, how to use it.
Lets assume you have list of numbers to take care of like [1, 1, 2, 3, 5, 8]. No cookies for guessing its’ Fibonacci numbers.


# Usual approach
fibList = list()
fibList.append(1)
fibList.append(1)
fibList.append(2)
fibList.append(3)
fibList.append(5)
fibList.append(8)
# You can generate it using any simple looping as well

# Dict approach
fibDict = dict()
fibDict[1] = True
fibDict[1] = True
fibDict[2] = True
fibDict[3] = True
fibDict[5] = True
fibDict[8] = True
# Assign the lightweight True or anything throwaway

I wonder why we don’t use it all the time & ditch list straight away. Here’s the catch.

Whenever the order or permutation of numbers is super important & can’t be accepted in any haphazard order we use list instead of dict.

Universal data structure truth

So, why did I suggested it in first place. Well, there are instances in your life when you have to travel by speed of life through your data structure & you don’t give a damn which piece of it was first.

Python dict or any language’s associative array uses a hash function which calculates address for storing. You store it there & wherever you demand for the stored value using its key the address is calculated (which is a one to one function). You directly go to the required memory location instead of looking all through the data structure. Blazing fast !

Well, what if I need to access it very fast & persist the order at the same time.

Use trees

Trees are data structures which give search O(log(n)) time complexity for searching throughout the data structure.
More specifically self balancing trees like the Red black trees.
That comes out of the box with C++ STL in std::map . I think you’ll have to implement something like this yourself when using python. Tell me if I am wrong or missing something.

CodeChef IDE sucks

Codechef makes programming very challenging

One fine evening I was trying to attempt CodeChef Lunchtime Oct 2019. After grueling with pen & paper I decide to code on CodeChef’s IDE.

Question https://www.codechef.com/LTIME77B/problems/BOXGAM97
My code is as follows:

# cook your dish here
def flip(P):
    if P == 0:
        return 1 
    else:
        return 0

def rightOrLeft(box, P, pos, N):
    if P == 1:
        if pos == 0:
            return 1
        if box[pos - 1] < box[pos + 1]:
            pos -= 1 
        else:
            pos += 1
    
    else:
        if pos == N - 1:
            return N - 2
        if box[pos - 1] > box[pos + 1]:
            pos -= 1 
        else:
            pos += 1
            
    return pos
    
cases = int(input())

for cas in range(cases):
    #Code here
    N, K, P = map(int, input().split())
    
    box = list(map(int, input().split()))
    
    if P == 0:
        pos = box.index(max(box))
    elif P == 1:
        pos = box.index(min(box))
        
    P = flip(P)
    
    for b in box:
        pos = rightOrLeft(box, P, pos, N)
        P = flip(P)
            
    print(box[pos])

Codechef IDE produces error even for demo test case which is

2
4 3 0
4 1 9 5
4 3 1
4 1 9 5

Runs well in geeksforgeeks IDE
Proof : https://ide.geeksforgeeks.org/LV3Eg8xYqV

Same code on different platforms run differently. So, I tried to run it on my Desktop (Windows 10, Python 3.7.3) and it works like charm.
Observe the 9 & 1 in between the inputs. That is the required output for the test case.

C:\Users\pankaj.kum\Desktop>python try.py
2
4 3 0
4 1 9 5
9
4 3 1
4 1 9 5
1

So, to conclude CodeChef has a quite jerky system within. They need to listen to the abuses seriously. Otherwise it will be too late.

P.S GeeksForGeeks, I love you!

Compound indexing with MongoDB

We have our own reasons to go for NoSQL databases. But that doesn’t mean you can’t enjoy blazing-fast indexing features of old school relational databases. It is provided out of the box with smart solutions & even MongoDB.

So, I have data in MongoDB that looks like this.

{
    "_id" : ObjectId("54f612b6029b47909a90ce8d"),
    "Posting Company" : "Zeta",
    "Posting Title" : "Backend Engineer",
    "Division" : "Software developer",
    .....
}
{
    "_id" : ObjectId("2hs7s7s6029b47909a90ce8d"),
    "Posting Company" : "Flock",
    "Posting Title" : "Frontend Engineer",
    "Division" : "Software developer",
    .....
}
.....

Now, under normal fetch operations like below, the web app had a bad time.

db.jobs.find(
    "Posting Company" : "Flock",
    "Posting Title" : "Frontend Engineer",
    "Division" : "Software developer",
);

It took more than 10 seconds to fetch data (even 20 – 30 seconds when the match was large in number). I mean, this is simple stuff right. Let’s do some indexing to nail it.

db.jobs.createIndex( { "Posting Company": -1 } );

But that did not work great. Even for a single record match, it took quite a time. It was still sorting through "Posting Title" and "Division" by going through all records of "Flock". Hard work! I started recording time now to benchmark. I wrote a python code that spat time taken at me like this.

5156 records
db: 3.685041904449463
tat@mmpbacula$ python3 test10.py
5156 records
db: 3.6494805812835693
tat@mmpbacula$ python3 test10.py
5156 records
db: 3.5948400497436523

But it still took 3.5 seconds approx. (I would say 4 seconds). If you ask the old sage relational guys they will tell you to stay away from NoSQL because of these reasons. “Man NoSQL will slow you down”.

But no. Hear something called Compound Index of MongoDB. To quote from the docs…

A single index structure holds references to multiple fields within a collection’s documents .

MongoDB docs on compound index

So, I had three criteria to match at a time & I applied Compound indexing.

db.jobs.createIndex( { 
    "Posting Company": -1,
    "Posting Title": -1,
    "Division": -1 
} );

The measured time is like this now. ( ~ 0.3 seconds)

6245 records
db: 0.343005895614624
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.34383082389831543
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.40097570419311523
tat@mmpbacula$ python3 test10.py
6245 records
db: 0.4061148166656494

It works like lightning and loads under a second. I am impressed. So, refer to manuals of your stuff. They tell you a lot about how to use them efficiently.

Python for competitive programming

Last night I was participating in CodeChef October Cookoff for the first time. It was a good experience even though I could solve one problem completely & two others partially attempted out of total five.

So, there’s a language debate over which should be better for competitive programming. I natively speak in python though I started off with C/C++ family. So, let’s see some comparison over a brute force solution to a problem written in Python (A slow language) and C++ (A fast language).

Python takes > 5.01 seconds (The limit for python on CodeChef is 5.01s).

cookpy

Now, Even C++ fails to nail the brute force approach (The limit for C++ on CodeChef is 1.01s).

cookcpp

So, the crux of the story is…

Concentrate on the correct Logic / Algorithm to solve rather than a certain specific language.

Though this is for the beginners & of course data structure are heavy with Python but for your own sake learn better algorithms each day.

My 10x

I recently read 10x Rule book. Trying to make my own.

1x Goal:
Become 5 star coder on codechef

10x goals:
Reach 7 stars on codechef :: Red user on codeforces

1x Action:
Do 1 problem daily

10x Action:
Do 10 Problems daily

I am solving questions from here
350 questions / 10 questions per day = 35 days required.
I target 30.

This is crazy stuff. Let’s see how promising it is.

Design a site like this with WordPress.com
Get started