Tuesday, February 13, 2018

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

While debugging some code that visualizes a simple plot, this error popped up out of the blue: "ImportError: libGL.so.1: cannot open shared object file: No such file or directory".

A quick Google search later, it was suggested that I use
import matplotlib
matplotlib.use("Agg")

Fine... and it did work! No more error message... but I also stopped seeing my plot! :) Agg is a non-interactive backend that only renders to files, so nothing shows up on screen.
No good.
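
For what it's worth, the Agg route is still handy when all you need is an image file (on a headless server, for example). Here is a minimal sketch of that, assuming matplotlib is installed; the data points and the output file name agg_demo.png are made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # choose the non-interactive backend before importing pyplot
import matplotlib.pyplot as plt

# Illustrative data only
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# With Agg no window will open, so write the figure to a file instead
out_path = os.path.join(tempfile.gettempdir(), "agg_demo.png")
fig.savefig(out_path)
plt.close(fig)
```

The plot ends up in the PNG file rather than in a window, which is exactly why I "stopped seeing" it.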

After digging deeper, I discovered that Red Hat does not ship native OpenGL libraries, but does ship the Mesa libraries, an MIT-licensed implementation of the OpenGL specification.
You can add the Mesa OpenGL runtime libraries on your system by installing the following package:

$ sudo yum install mesa-libGL

Problem was solved!
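
If you want to check from Python whether the library is visible before (or after) installing the package, something like the sketch below may help. It relies on the standard library's ctypes.util.find_library; the helper name libgl_status is my own, and the lookup mechanism depends on the platform, so treat the result as a hint rather than a guarantee.

```python
import ctypes.util


def libgl_status():
    """Return a short message saying whether a GL shared library can be located."""
    # find_library returns a file name such as "libGL.so.1", or None if nothing is found
    name = ctypes.util.find_library("GL")
    if name is None:
        return "libGL not found; on Red Hat try installing mesa-libGL"
    return "libGL found as " + name


print(libgl_status())
```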

Happy coding

Wednesday, February 7, 2018

Add github master to your project

If you have an active project on your laptop and a central git repository already in place, follow these simple steps to add GitHub as an extra backup remote for your project.


# add github as an extra remote
$ git remote add github https://github.com/your_name/repository_name.git
# push master to github
$ git push github master

Fun with the apply function in pandas

While doing some data munging, I came across one issue that had me running in circles and made me recheck my logic over and over...

I was getting duplicates in my output set while using the 'apply' function on a dataframe. More specifically, I used 'apply' on a grouped (groupby) dataframe to process each batch of records, apply a set of business rules, and write the results to an external file.

Like I said, I had duplicates!

So... the reason, as it became apparent to me, is that 'apply' needs to figure out whether you are mutating the passed data in order to take a fast or slow path to execute the code. To do that, it calls your function twice on the first group. Hence I had duplicates in my external file. You won't notice this if your function only writes to the dataframe itself: it just writes the same value twice, replacing the old one, and it is transparent to you.
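
One way around this is to keep the function passed to 'apply' free of side effects and do the writing once, after 'apply' has returned. Below is a minimal sketch of that idea, not my actual code: the dataframe, the column names, and the total_for_batch function are made up, and an in-memory buffer stands in for the external file. The counter is only there so you can see how many times your pandas version calls the function; that number can differ between versions.

```python
import io

import pandas as pd

# Illustrative data: two batches of records
df = pd.DataFrame({"batch": ["a", "a", "b", "b"], "amount": [10, 20, 30, 40]})

call_count = {"n": 0}  # how many times pandas actually invoked the function


def total_for_batch(amounts):
    # No file writes in here, so an extra call by pandas is harmless
    call_count["n"] += 1
    return amounts.sum()


totals = df.groupby("batch")["amount"].apply(total_for_batch)

# Write exactly once per batch, from the final result
buffer = io.StringIO()  # stands in for the external file
for batch, total in totals.items():
    buffer.write(f"{batch},{total}\n")

print("function calls:", call_count["n"])
print(buffer.getvalue())
```

However many times the function is called, the buffer gets exactly one line per batch, because the write happens outside of 'apply'.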

References:
https://github.com/pandas-dev/pandas/issues/7739
https://github.com/pandas-dev/pandas/issues/6753