First BangPypers Meetup

BangPypers, the Bangalore Python User Group, is one of the largest and oldest Python user groups in India. It has been running successfully for more than 10 years. Most of the known faces in the Indian Python community have been a part of this meetup group at some time or other.

After I moved to Bangalore I had been on the lookout for its meetups, but I missed them in the first 2 months since some work always came up on the meetup days. Last Saturday, 21st September, I finally made time to attend my first BangPypers meetup.

The meetup was scheduled for 10.30 am at Visa Technology Centre, Bagmane Tech Park. I decided to go by cycle since I had been looking for a long ride for a while. But alas, it took longer than expected owing to a wrong turn on my way.

The theme of this week was Design Thinking and it had 3 talks scheduled. I reached around 11.15 am, almost missing the first talk, which was a Design Patterns 101 talk. The speaker was almost at his closing notes. Nevertheless I found a place to sit in the almost full room.

I took out my phone to take notes and waited for the next talk, which was on Design Patterns for Distributed Architecture. The talk covered various best practices related to distributed systems and the tools and techniques to achieve them. The discussion revolved around horizontal scalability, responsiveness, security, failure handling, centralised logging, metrics, request tracing, health checks, configuration and discovery. Since I had never dealt with production-level Python applications or distributed systems, the terms were quite new to me; a few of them I had encountered in the past but hadn't fiddled with in depth. I made a note of the discussion for future reference.

The last talk of the day was Organize your bookshelf using Micropython by Vinay Keerthi, one of the hosts at Visa, which is also a PyCon India 2019 accepted talk. He made an LED bookshelf organiser which would tell him the position of a book on the shelf by lighting up an LED array. He used MicroPython on a NodeMCU, along with a simple Flask app which would fetch data from a PostgreSQL database and store it in a queue, which in turn would send signals to the LEDs. I really liked the idea and it was a really hacky way to find a book. After his presentation I talked to him to get some pointers on starting my first MicroPython project, which I plan to do in the upcoming days.
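
For a flavour of what driving LEDs from MicroPython looks like, here is a minimal sketch of my own (assuming a plain LED on GPIO 2, the on-board LED on most NodeMCU boards – Vinay's actual setup with the Flask app, queue and PostgreSQL backend was of course more involved):

# A minimal MicroPython sketch (my own illustration, not Vinay's code).
# Assumes an LED on GPIO 2, the on-board LED on most NodeMCU boards.
from machine import Pin
import time

led = Pin(2, Pin.OUT)

def blink(times=3, interval=0.5):
    # Blink to mark a book's position on the shelf
    for _ in range(times):
        led.value(0)          # the on-board LED is active-low on NodeMCU
        time.sleep(interval)
        led.value(1)
        time.sleep(interval)

blink()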

After the session I talked to the organisers, Anirudha and Abhiram, sharing my experiences with the Hyderabad Python community and learning how they conduct BangPypers meetups. Visa had arranged lunch for the attendees, so the discussion continued at the lunch table where I got to talk with other attendees about the work they were doing. Bangalore has a vibrant developer community and there are many more communities like PyData and PyLadies which conduct regular meetups. PyLadies Bangalore are going to have a meetup on the coming Saturday, September 28. Here is the announcement about the same.

At the end Vinay shared some more stories of his hacks and experiences tinkering with MicroPython. And yeah, here are some photos I took at the meetup.

What made the meetup nice was the great venue and the topics of the talks. With the boom of ML/AI in Python land, it is good to have talks on fundamental topics that are unlike those. Hope to attend future meetups as well!

Lastly I rode back home, completing my first 20 km ride on my cycle 😛

I also tweeted about the meetup after I returned home that day.

Takeaways from the Meetup

  • Design Patterns in Python – Spend more time learning these.
  • Got to know about best practices in distributed systems – Keep notes for future projects.
  • Micropython getting started – Order some NodeMCUs and get my hands dirty. TO DO: Read blogs regarding starter projects.

DevConf India 2019 experience

One month ago I attended DevConf India 2019. It was held from August 2-3 at Christ University, Bangalore. It had been quite a while since I attended a conference, the last one being PyCon India 2018. Due to my laziness in writing I have altogether missed writing up my conference experiences till now; this is in fact my first conference blog post. From now on I will make sure to write about each conference I go to.

I came to know about DevConf India last year when it was first held, from people in my developer circle. DevConf India is organised by Red Hat and has sibling events in the US and the Czech Republic. Since I was in Bangalore this year, I made sure to attend after seeing the dates on Twitter, and I registered as soon as I came to know about it.

Day 1

I mostly planned on meeting up with people and attending a few talks. I started at 8.30 am from my place and unfortunately missed the keynote owing to a bad experience with a bike taxi service. I reached the venue around 10 am and collected my attendee badge and T-shirt. Then I headed towards the keynote session hall where I met Naren from Chennaipy. I had met him earlier at PyCon India 2018, so it was nice catching up with him.

After having breakfast at the venue I headed to the booth area where I met Chandan. I started visiting the booths, asking questions about various projects like Fedora, Debian and CentOS. Shortly after, I met some more familiar faces from Dgplug – Sayan and Rayan – both of whom I had met during PyCon India last year. I expected a Dgplug staircase meeting at DevConf, but unlike last year there were fewer attendees this time. After that we went for lunch at the cafeteria where I met Ramki, Praveen and pjp. A few days earlier I had been reading pjp's tutorial on gcc and gdb from the dgplug IRC logs, so it was nice to catch up with him in person. I discussed with Sayan and Praveen the initial days of dgplug at my college, NIT Durgapur, and attending their first talk in 2013 when I had just joined.

After lunch I decided to attend a few talks. I attended a talk on the Evolution of Containers – there I came across terms like chroot, cgroups and namespaces, and how the whole container ecosystem was born. I have always been inquisitive about containers, and though I hadn't really worked with them before, this talk fascinated me enough to want to dive into the world of containers.

Then I attended a talk on What hurts people in an Open Source Community. The talk helped to set my expectations right regarding contributing to an open source project and community.

After that I went to the closing keynote of the day, shortly after which we went for evening snacks, where we had more discussions over coffee and dosa – we noticed an item on the menu called ‘Open Dosa’, over which we laughed a lot 😛. And that was finally the close of the day.

Day 2

I reached the venue a little late and went straight to the talk that I didn't want to miss. It was a Documentation BoF where speakers were discussing how to create effective documentation and tools for creating collaborative documentation. I came across user-story-based documentation and tools of the trade like asciidoc and asciidoctor. I met Ladar Levison during that session and talked with him about better project organisation. He gave me his business card, which mentioned Lavabit. I knew little about him until I read this article, which explained more about Lavabit and his role in Snowden's secure email communication. But that was after the conference, and I wish I could have talked more about privacy and the Lavabit projects.

After that I went for lunch with Sayan, Chandan and Rayan, where we chatted about a lot of different stuff: open source, food and conferences. After lunch I went to attend Sinny's talk on Fedora CoreOS; Sayan had introduced us the previous day.

Finally it was nearing the end of the day. I went to attend the closing keynote by Jered Floyd and sat beside Christian Heimes from Red Hat who was sharing anecdotes from his travel experiences.

Notes from the Conference

I made a few notes that I would like to share from my experience at the conference, and also as a reminder to myself for future conferences:

  • Try to look up the people you meet so that you can know more about them. You may not know everything about the person you are talking with, but they can actually be a mine of knowledge. Ask for the person's email/Twitter so that you can follow up after the conference.
  • It’s always good to prepare some questions if you are likely to meet a person you met/knew online. You have the opportunity to talk face to face and ask about the projects the person works on. You can even do that within the conference when you are free.
  • When you attend a talk, ask good questions that can start a conversation. Usually people take an interest in following up with you after the talk, and you get to talk to more people.
  • It's always good to be a speaker at the conference. That way there is a higher chance of starting a conversation with people you don't know and are meeting for the first time. This is something that I really need to work on, and hopefully I will be able to submit a talk to the next conference I attend.
  • When going for lunch, tag along with a group so that you get to meet more people. If you are an introvert this works really well, as you can meet friends of friends and interact much more easily!

And yes, don't forget to take pictures 🙂 It really brings back memories. It may sound weird, but this is something I forget every time I meet up with people, and then I wait for the conference photos.

A New City, A New Beginning

Two and a half months ago I shifted to Bangalore after a two-year stint at my first company in Hyderabad. I was looking for new opportunities and started appearing for interviews, when I hit the hard realisation that what I had learned till then was not enough. I needed to self-learn more, interact more with people, and learn from the stories of their experiences. I landed a job at an EDA firm in Bangalore and decided to move there.

Leaving the city, but not the memories!

Time really flies; it has already been 2 years of working in the software industry. I met a lot of people, made friends and shared some good experiences. I became aware of the meetup culture, realised that writing code is not the only thing that makes you a good developer, and realised that communication is a key factor in conveying your ideas.

I became part of the HydPy community and organised 2 conferences – PyConf Hyderabad 2017 and PyCon India 2018. I met some like-minded people passionate about technology and about growing the Python community in Hyderabad. I am still a part of the community and hope to continue to be.

I also attended the dgplug Summer Training in 2018, made many friends there as well whom I keep meeting during conferences. They are a really amazing community and there is always something to learn from each of the IRC conversations.

I interacted with a lot of people from the Python and open source communities. Going to conferences and meetups is a really good way to interact with awesome people and share knowledge. But it's also true that you need to work on something tangible and develop your skills; only then can you move forward and put the experiences to use.

I had been living in the heart of the fast-growing city of Hyderabad. It was a nice experience as a whole (apart from the traffic on rainy days!). I explored the places around in the initial days. Some significant events happened during my stay: GES 2018 was held, the Hyderabad metro rail was inaugurated, and the first IKEA store in India opened up. I spent my first Durga Puja away from home. I realised that Kolkata biryani is still better than Hyderabadi biryani. I saw more buildings, tech parks and flyovers being constructed at a rapid pace. I got to experience the hot summers and pleasant winters. Altogether it was a good time in the city of the Nizams. Hope to come here again someday!

The New Phase, What Next?

It's been 2.5 months since I came to Bangalore. It's called the Garden City of India because of the amount of greenery it has, but I would rather call it a City of Traffic! Dealing with traffic here can be terrifying. Nevertheless there is a lot of greenery left, and I see palm trees here and there quite often. The area I live in is booming with an array of multinational companies, and I have noticed a good number of semiconductor companies too – rightly earning it the name “Silicon Valley of India”. It's pretty early to state my experience of this city, so maybe I will write about it in the next phase 🙂

So what next? I want to make my time here more productive and develop some good habits that I have been putting off: get into the habit of writing, take time out for more self-learning, contribute to open source more often, interact with more people, put my experiences to use as much as I can, get more exercise and cut the laziness. In fact I got myself a cycle and have been doing my daily commute on it! I know it's more talk than work right now, but I want this blog post as a reminder, so that whenever I wander off I can come back to this page and find what I need to do – and keep myself prepared for the next phase.

Adding Print Preview Support to CEF

Chromium Embedded Framework (CEF) is a framework for embedding Chromium-based browsers in other applications. Chromium itself isn't an embeddable library; CEF fills that gap by letting you embed a Chromium browser in a native desktop application. The project was started by Marshall Greenblatt in 2009 as an open source project, and since then it has been used by a number of popular applications like Spotify, Amazon Music, Unreal Engine, Adobe Acrobat, Steam and many more (the full list can be found in this wiki). CEF is supported on Windows, Linux and macOS.

There are 2 versions of CEF – CEF1 and CEF3. CEF1 was a single-process implementation based on the Chrome WebKit API; it is no longer supported or maintained. CEF3 is a multiprocess implementation based on the Chromium Content API and has performance similar to Google Chrome.

Preface

The purpose of this article is to document the work I did on the CEF project. The major focus is on the print preview support that I worked on adding to the upstream CEF project.

For the past year I have been working on CEF as part of my day job. Initially the work was to keep our CEF up to date with every Chromium version released. We had a fork of the CEF open source project in which we applied some extra patches as per the requirements of the custom desktop application that used it. Building CEF was quite similar to building Chromium: it had a build script which handled everything from downloading the code to building and packaging the binaries, all documented in this link. Upgrading CEF was quite a task, since building CEF took a lot of time and resources (a lot of CPU cores and a lot of memory), and since CEF is based on Chromium I had to skim through parts of the Chromium code. Chromium has nice developer documentation and a good code search engine that eased a lot of things, but owing to its huge codebase the documentation was outdated in a few areas.

Feature Description

The interesting part of the CEF project came when I was handed the task of filling in a missing piece in CEF. Chromium supports in-browser print preview, where you can preview pages before printing, similar to the one shown in the picture below.

CEF didn't support this feature and had the legacy print menu, where you cannot preview the pages to be printed.

This meant applications that used CEF couldn’t support print preview within them. The task was to make print preview available in CEF.

Initial work

The work started with CEF version 3112 (supported chromium v60) and was in a working state in CEF 3239 (supported chromium v63) in our CEF fork. At that point the change was supported only on Windows, since the desktop application that used it was Windows-only. I took over the work in CEF 3325 (supported chromium v65), by which time the following specs already existed in the print preview patch. The relevant blocks of code are available in the CEF source code now.

  • enable_service_discovery is disabled now
  • CefPrintViewManager::PrintPreviewNow() will handle print-preview
  • CefPrintViewManager no longer handles plain printing; it handles the PrintToPdf function exposed through browserHost. Also, it listens for the two print preview messages PrintHostMsg_RequestPrintPreview and PrintHostMsg_ShowScriptedPrintPreview and generates the browserInfo for the print preview dialog
  • Define interfaces from web_modal::WebContentsModalDialogManagerDelegate and web_modal::WebContentsModalDialogHost required by the constrained window of print preview dialog
  • Define platform specific definitions of GetDialogPosition() and GetMaximumDialogSize() used by browser platform_delegate
  • Register Profile preferences required for printing
  • Add ConstrainedWindowViewsClient class which is called in browser_main.cc
  • CefPrintViewManager::InitializePrintPreview() initializes the print preview dialog which further calls PrintPreviewHelper::Initialize() which generates the browser_info required by print preview
  • Remove CefPrintViewManagerBase and its associated methods from CefPrintViewManager. Those methods are redundant after the print preview changes
  • Check for switches::kDisablePrintPreview in CefPrintRenderFrameHelperDelegate::IsPrintPreviewEnabled() to determine whether print preview is enabled
  • Add chromium patch to fix errors in debug build

My Contribution to CEF Print Preview

After I took over the CEF print preview work I made a number of changes to print preview, the specs of which are documented below:

  • Add print_preview_resources.pak in BUILD.gn to fix blank screen error which came from chromium v72 onwards because of updated print preview UI
  • Add PrintPreviewEnabled() to extensions_util
  • Add switches::kDisablePrintPreview to kSwitchNames in CefContentBrowserClient::AppendExtraCommandLineSwitches() to disable print preview on using --disable-print-preview command line switch
  • Remove print_header_footer_1478_1565 chromium patch since it’s no longer required to disable print preview by default
  • Add WebContentsDialogHelper() to wrap web_modal::WebContentsModalDialogManagerDelegate and web_modal::WebContentsModalDialogHost interface methods. Move it in a separate header and cc file
  • Add support for disabling print preview via chromium preferences
  • Disable print preview for OSR mode
    • For ui print this is done by using CefBrowserHostImpl::IsWindowless() method
    • For scripted print this is done by passing the is_windowless parameter in CefPrintRenderFrameHelperDelegate object from CefContentRendererClient::MaybeCreateBrowser() method in content_renderer_client.cc
  • Fix DownloadPrefs::FromBrowserContext for CEF use case. Add GetDownloadPrefs() to CefBrowserContext and use that to get the download_prefs in chromium
  • Add extra_info param to CreatePopupBrowserInfo()
  • Fix MacOS build. Add GetNativeView() method for MacOS platform delegate and update GetMaximumDialogSize() method
  • Disable print preview for MacOS (in extensions::PrintPreviewEnabled()) since the print dialog crashes on print
  • Disable print preview if pdf extension is not enabled
  • Use CEF file chooser instead of chromium while saving to pdf in print preview. Add ShowCefSaveAsDialog() and SaveAsDialogDismissed() to PdfPrinterHandler.

Challenges faced

Integrating print preview was a big and non-trivial change in CEF, since it not only needed a good understanding of the printing code in Chromium, but the print preview feature was also getting constant updates from Chromium. The code was changing with every Chromium version released, and the print preview documentation in Chromium was outdated.

CEF3 has a multiprocess architecture similar to Chromium's, documented here. There is a main browser process and multiple renderer processes. Debugging multiprocess applications can be tricky. I used Visual Studio, which made things a bit easier as it has the Child Process Debugging Power Tool, an extension that automatically attaches to child processes and let me debug them whenever they spawned.

The chromium v72 version introduced a new print preview UI which broke the renderer: we got a blank screen in print preview. It took weeks to figure out what was wrong. Finally it turned out that a pak file was missing and needed to be included in BUILD.gn. I had to spend multiple debugging sessions with my team to figure that out.

Also, the feature had to be supported on all platforms (Windows, Linux, macOS) to qualify for merging into the CEF upstream repo, and each platform had a different way of rendering dialogs. Though the Windows support was working, Linux and macOS weren't supported in the changes yet. I added the Linux support after building CEF in a Linux VM. The macOS support finally didn't work out, and we had to keep using legacy print for the Mac platform. I still needed to ensure the change built fine on Mac, so I had to build it there as well (I was given a separate Mac machine just for this, since macOS doesn't ship as VM images), and in fact the change broke the macOS build, so those issues had to be fixed.

Conclusion

Even after all these changes, the functionality broke after an architectural change was made in CEF version 3770 (supported chromium v75) in this commit, which rendered a blank screen during print preview. Marshall took over the work from there and made a number of changes to the patch, which can be seen in the PR chromiumembedded/cef#126. The change was added manually in master revision 1669c0a on 20th July. It will be supported from the next CEF version (supported chromium v77). The current implementation supports print preview on Windows and Linux via the --enable-print-preview flag.

Overall it has been a good experience working on the project, and I got to know a lot about Chromium itself. This was my first major contribution to a C++-based project. It helped me understand how a browser works under the hood: how it renders web pages and processes JavaScript. I hope to carry this knowledge forward into a similar browser-based project in the future.

Rescuing GRUB2 from rescue mode in Fedora 30

About 3 months back I installed the newly released Fedora 30 – dual booting with Windows 10 on my PC. This blog post comes from the notes I made during that time, and serves as a troubleshooting note for the future.

I had Fedora 29 and Windows 10 dual booting on my PC before that. The Fedora install partition was running out of space due to the small disk allocation I chose during my last install, so I decided to do a clean reinstall this time. I made a live USB using the Fedora Media Writer for Windows and the Fedora 30 ISO available on the getfedora download page. I followed the usual steps I have followed for earlier Linux installations on my PC, similar to what is shown in this video.

The installation went fine and finally I was ready to boot from my hard drive. Then I saw what is called the Dreaded GRUB2 boot prompt.

error: unknown filesystem.
Entering rescue mode...
grub rescue>

First Attempt at a Fix

I quickly started looking for a way to fix GRUB. The first thing I found were the steps listed in this video. I had to choose the right partition from which the bootloader would load.

grub rescue> ls
(hd0) (hd0,msdos4) (hd0,msdos3) (hd0,msdos2) (hd0,msdos1)

This shows the various partitions on my hard drive. One of these is the Linux partition where my Fedora 30 OS is installed. I needed to list the contents of each partition; one of them would have the Linux filesystem.

grub rescue> ls (hd0,msdos4)/
bin/  boot/  dev/  etc/  home/  lib/  lib64/  lost+found/  media/  mnt/  opt/  proc/  root/  run/  sbin/  srv/  sys/  tmp/  usr/  var/

Now I ran the following commands and waited for the system to boot up

grub rescue> set prefix=(hd0,msdos4)/boot/grub2
grub rescue> insmod normal
grub rescue> normal

I got the same grub rescue boot prompt but this time with a different error

error: file '/boot/grub2/i386-pc/normal.mod' not found.
Entering rescue mode…
grub rescue>

Second Attempt ..

The issue was that the i386-pc folder was missing from the /boot/grub2 folder. The fix I found was related to GRUB2 not being properly installed at the boot location. Luckily I was able to boot Fedora via the UEFI boot option from the boot menu. I logged into Fedora and reinstalled GRUB2.

$ sudo grub2-install /dev/sda
$ sudo dnf install grub2-efi

I hoped that this would fix the issue, but it again came down to the same starting point: the grub rescue prompt.

Third time is the charm!

I searched further and landed on the Fedora GRUB2 manual. After reading it I realised there was something wrong in my GRUB2 configuration. I booted into my OS using UEFI boot and opened the /boot/grub2/grub.cfg file: the entry for Windows was missing. I followed the steps given in this section. I went to the grub rescue prompt and fired the following commands

grub rescue> set root=(hd0,msdos4)
grub rescue> linux /boot/vmlinuz-5.1.20-300.fc30.x86_64 root=/dev/sda4
grub rescue> initrd /boot/initramfs-5.1.20-300.fc30.x86_64.img
grub rescue> boot

Then I recreated the grub.cfg file using these commands

$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
$ sudo grub2-install --boot-directory=/boot /dev/sda

Voila! I was able to see the grub menu with all the boot entries.

Postmortem Report

So why did the issue actually occur? It hadn't happened in the past whenever I did a fresh installation, nor is it an issue specific to Fedora 30. I tried to dig into the actual cause, and after a little investigation and introspection I came to this conclusion.

Using the Fedora Media Writer was in fact where I unknowingly made a mistake. In the past I usually used UNetbootin for creating my Linux live USBs, which made the images boot in BIOS-only mode. The Fedora Media Writer enables booting in both UEFI and BIOS modes. My Windows installation boots via legacy boot and has always been like that, and since I had been using UNetbootin earlier, the live images always booted via BIOS. This time, while creating the Fedora 30 image using the Fedora Media Writer, the default boot mode picked was UEFI, which created an EFI GRUB2 configuration. When I booted the live USB I just picked the “Boot from USB” option without noticing whether it was UEFI or BIOS, and went ahead with the Fedora 30 installation. So my default boot option was legacy boot (since it supports the Windows boot) while the installed Fedora GRUB loader was created to boot in EFI mode. That mismatch in turn caused this problem, leaving a broken GRUB2 configuration.

Moral of this Story

Always be careful while creating OS images: check how the image is supposed to boot. In a dual boot setup, all the OSes must boot via the same mode – either both UEFI or both BIOS. So when doing a clean install of a second OS, make sure it boots via the same mode as the already installed OS.

Understanding python requests

In this post I am going to discuss the python-requests library. Python-requests is a powerful HTTP library that helps you make HTTP(S) requests very easily with a minimal amount of code, and it supports Basic HTTP Authentication out of the box. But before getting into it, I want to describe the motivation behind writing this post.

When it comes to writing software, libraries are a lifesaver. There is a library that addresses almost every problem you need to solve. That was the case for me as well: whenever I faced a specific problem I would look to see if a library already existed. But I never tried to understand how they were implemented, the hard work that goes into building them, or the folks behind them. Most of the libraries we use these days are open source and their source code is available somewhere, so we could, with a little hard work, understand the implementation.

During a related discussion with mbuf in the #dgplug channel, he gave me an assignment: understand one of the libraries I had recently used and find out what data structures and algorithms it uses. So I chose to look inside the source code of python-requests. Let's begin by understanding how two nodes in a network actually communicate.

Socket Programming: The basis of all Networking Applications

Socket programming is a way of connecting two nodes in a network and letting them communicate with each other. Usually, one node acts as a server and the other as a client. The server listens on a port at an IP address, while the client reaches out to make a connection. The combination of an IP address and a port is called a socket. The listening socket on the server listens for requests from the client.

This is the basis of all the web browsing that happens on the Internet. Let us see what a basic client-server socket program looks like.
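
Here is a minimal sketch of the two sides, first the server and then the client:

# server.py – a minimal TCP server sketch
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', 12345))          # '' means all interfaces (0.0.0.0), port 12345
server.listen(5)                  # queue up to 5 pending connections

while True:
    conn, addr = server.accept()  # wait for a client to connect
    print('Got a connection from', addr)
    conn.sendall(b'Thank you for connecting')
    conn.close()                  # send the response and close the connection

# client.py – the matching client sketch
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', 12345))   # connect to the server on localhost
print(client.recv(1024).decode())      # read up to 1024 bytes of the response
client.close()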

As you can see, the server binds to a port where it listens for any incoming request. In our case it is listening on all network interfaces, 0.0.0.0 (represented by an empty string), at an arbitrary port, 12345. For an HTTP server the default port is 80. The server accepts an incoming request from a client, then sends a response and closes the connection.

When a client wants to connect to a server, it connects to the port the server is listening on and sends in the request. In this case we send the request to 127.0.0.1, which is the IP of the local computer, known as localhost.

This is how any client-server communication looks. But there is obviously a lot more to it: a server will get more than one request at a time, so we need a multi-threaded server to handle them (a rough sketch of that follows below). Also, here I sent simple text, but there could be different types of data like images and files.
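
A minimal multi-threaded variant of the server sketch, handing each connection to its own thread:

# A rough multi-threaded variant of the server sketch
import socket
import threading

def handle(conn, addr):
    # each client is served on its own thread
    conn.sendall(b'Thank you for connecting')
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('', 12345))
server.listen(5)
while True:
    conn, addr = server.accept()
    threading.Thread(target=handle, args=(conn, addr)).start()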

Most of the communication that happens over the web uses HTTP, a protocol for the exchange and transfer of hypertext, i.e. the output of the web pages we visit. Then there is HTTPS, the secure version of HTTP, which encrypts the communication happening over the network using protocols like TLS.

Making HTTP Requests in Python

Handling HTTP/HTTPS requests in an application can be complex, and so we have libraries in every programming language that make our life easier. In Python there are quite a few libraries that can be used for working with HTTP. The most basic is http.client, a CPython standard library module. http.client uses sockets underneath to make the request. Here's how we make an HTTP request using http.client.
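
Below is a minimal sketch; the host httpbin.org and the user/passwd credentials are placeholders for illustration:

# An HTTP request with http.client (host and credentials are placeholders)
import base64
import http.client

conn = http.client.HTTPSConnection('httpbin.org')

# Basic HTTP Authentication: base64-encode "user:password" and send it
# as an Authorization header
auth = base64.b64encode(b'user:passwd').decode('ascii')
headers = {'Authorization': 'Basic ' + auth}

conn.request('GET', '/basic-auth/user/passwd', headers=headers)
response = conn.getresponse()
print(response.status, response.reason)
print(response.read().decode())
conn.close()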

For making requests that involve authentication, we have to add an Authorization header to the request. The base64 library is used here to generate a Base64-encoded authorization string.

Using python-requests for making HTTP requests

The http.client library is a very basic library and is not usually used directly for making complex HTTP requests. Requests is a library that wraps around http.client and gives us a really friendly interface to handle all kinds of HTTP(S) requests, simple or complex, and takes care of lots of other nitty-gritty, e.g., TLS security for HTTPS requests.

Requests depends heavily on the urllib3 library, which in turn uses the http.client library. The sample below shows how requests is used to make HTTP requests.
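
A minimal sketch making the same authenticated request, with the same placeholder host and credentials:

# The same request made with requests
import requests

# requests parses the URL and picks the right protocol (HTTP or HTTPS) itself
response = requests.get('https://httpbin.org/basic-auth/user/passwd',
                        auth=('user', 'passwd'))
print(response.status_code)
print(response.text)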

You can see that making requests is much simpler using the requests module. It also gracefully decides which protocol to use by parsing the URL of the request.

Let us now go over the implementation

Inspecting requests

The requests API contains methods named after the HTTP request types, so there are get, post, put, patch, delete and head methods.

Given below is a rough UML class diagram of the most important classes of the requests library

When we make a request using the requests API, the following things happen:

1. Call to Session.request() method

Whenever we make a request using the requests API, it calls the requests.request() method, which in turn calls the Session.request() method after creating a new Session object. The request() method then creates a Request object and prepares to make the request.

2. Create a PreparedRequest object

The request() method creates a PreparedRequest object from the Request object and prepares it for sending.

3. Prepare for the Request

The PreparedRequest object then makes a call to its prepare() method. The prepare method in turn calls prepare_method(), prepare_url(), prepare_headers(), prepare_cookies(), prepare_body(), prepare_auth() and prepare_hooks(). These methods do some pre-processing on the various request parameters.

4. Send the Request

The Session object then calls the send() method to send the request. The send() method then gets the HTTPAdapter object, which makes the request.

5. Get the Response

The HTTPAdapter makes a call to its send() method, which gets a connection object using get_connection() and then sends the request. It then builds the Response object using the request object and the httplib response from the httplib library (httplib is the Python 2 name of http.client).
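
This flow can also be seen from user code – a sketch showing the one-line API and the explicit Session path it expands to:

import requests

# The one-liner: a throwaway Session is created behind the scenes
r = requests.get('https://httpbin.org/get')

# The same flow made explicit: build a Request, prepare it into a
# PreparedRequest, and send it through the Session (and its HTTPAdapter)
with requests.Session() as session:
    req = requests.Request('GET', 'https://httpbin.org/get')
    prepared = session.prepare_request(req)   # steps 2-3: PreparedRequest
    r = session.send(prepared)                # steps 4-5: HTTPAdapter.send()
print(r.status_code)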

Now onwards: how does a request actually get sent, and how do we get an httplib response?

Enter the urllib3 module

The urllib3 module is used internally by requests to send the HTTP request. When control reaches the HTTPAdapter.send() method, the following things happen:

1. Get the Connection object

The HTTPAdapter gets the connection object using the get_connection() method. It returns a urllib3.ConnectionPool object. The ConnectionPool object actually makes the request.

2. Check if the request is chunked and make the request

The request is checked to see whether it's chunked or not. If it is not chunked, a call to the urlopen() method of the ConnectionPool object is made. The urlopen() method makes the lowest-level call to perform the request using the httplib (http.client in Python 3) library, so it takes in a lot of arguments from the PreparedRequest object.

If the request is chunked, a new connection object is created – this time the HTTPConnection object of httplib. The connection object is used to send the request body in chunks via the HTTPConnection.send() method, which uses sockets underneath.

3. Get the httplib response

The httplib response is generated using the urlopen() method if the request is not chunked; if it is chunked, the response is generated using the getresponse() method of httplib. httplib then uses sockets to read the response.
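
To see urllib3's role in isolation, here is roughly what using it directly looks like (a sketch – requests drives this same machinery through HTTPAdapter):

import urllib3

# A PoolManager hands out ConnectionPool objects, one pool per host
http = urllib3.PoolManager()

# request() ends up in ConnectionPool.urlopen(), which makes the low-level
# httplib (http.client) call over a socket
response = http.request('GET', 'https://httpbin.org/get')
print(response.status)
print(response.data.decode())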

And there you have it! Those are the most important parts of the requests workflow. There is a lot more you can learn by reading the code further.

Libraries make the life of a developer simpler by solving a specific problem and making code shareable and widespread. There's also a lot of hard work involved in maintaining a library. So in case you are a regular user of a library, do consider reading the source code if it's available, and contributing to it if possible.

Thanks to kennethreitz and the requests community for making our life easier with requests!

References

  1. https://www.geeksforgeeks.org/socket-programming-python/
  2. https://docs.python.org/2/howto/sockets.html
  3. https://en.wikipedia.org/wiki/HTTPS
  4. https://docs.python.org/3/library/http.client.html
  5. https://github.com/requests/requests
  6. https://github.com/urllib3/urllib3
  7. https://tutorialspoint.com/uml/uml_class_diagram.htm

Also, many thanks to my #dgplug friends for helping me improve this post.

Understanding DNS

In this post I will explain how DNS works, in my own words.

DNS stands for Domain Name System. It is a way of naming the addresses of computers or resources in a network. Every computer in a network is associated with an IP address; for example, the IP address of google.com is something like 172.217.163.110. It's tough for a person to remember such big numbers, so we have a simpler and more human-friendly way of naming these computers, which is what DNS provides.

So how does DNS work? When we type a website name into a browser, the IP address of the server mapped to that name needs to be found to display the web page. There is a well-defined process by which this IP address is searched. Let's understand the steps that are carried out to find the address.

First, Ask the Browser

Whenever we type a website name into a browser, the browser searches its cache to check if it has the IP address mapped to that website.

Then, Ask the OS

If the Browser doesn’t have the address it then asks the OS to check if it has the address.

Ask the Resolver

If the OS doesn't have the address, it points to the IP address of the resolver server that will resolve the IP address of the website. It is the role of the resolver to find the IP address of the website and bring it back to the OS. These are usually servers provided by the ISP serving the Internet connection. If you run cat /etc/resolv.conf on a Linux machine you will get an output similar to

# Generated by NetworkManager
nameserver 202.88.174.6
nameserver 202.88.174.8

These are the IP addresses of the resolvers responsible for finding the IP address of a website when a request comes in. The resolver first checks its cache to see if it has the IP address of the requested website. If it doesn't find the IP address, it then goes to the root to find it.

Ask the Root

The root server knows the addresses of the Top Level Domain (TLD) servers for the website. There are a total of 13 root servers spread all over the world. That doesn't mean there are only 13 physical servers; it means there are 13 unique names, each backed by multiple servers distributed to handle the load. The resolver gets the address of the TLD server from the root and goes there to find the IP address of the website. Each time the resolver gets an address, it saves it to its cache.

Ask the TLD Server

The top level domain is the .com part in google.com. Similarly, there are various top level domains such as .org, .gov, .net, .edu etc. There are also country-specific domains like .in, .us, .jp etc. The root server knows the addresses of these TLD servers. The TLD server then gives the address of the authoritative nameservers for the website's domain.

Ask the Authoritative Nameservers (DNS Servers)

The authoritative nameservers are the ones that contain the actual address of the website. Their names look like ns1.google.com, ns2.google.com etc. These are often simply called the DNS servers, as they contain the records of the addresses corresponding to a specific website name. Whenever you purchase a domain, the domain registrar sends the names of these DNS servers to the TLDs. That way a TLD can say which DNS server contains the address of a website. The DNS server gives the resolver the IP address of the website.

You can find the names of the DNS servers of a website and the website's IP address using the dig command. Here is a sample output of the dig command.

[ananyo@localhost ~]$ dig google.com

; <<>> DiG 9.11.4-RedHat-9.11.4-1.fc28 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33490
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 9

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 8114e77595b2cc097e7725a95b59fdf1b2d1c4d480039c49 (good)
;; QUESTION SECTION:
;google.com. IN A

;; ANSWER SECTION:
google.com. 40 IN A 172.217.163.78

;; AUTHORITY SECTION:
google.com. 31267 IN NS ns3.google.com.
google.com. 31267 IN NS ns2.google.com.
google.com. 31267 IN NS ns4.google.com.
google.com. 31267 IN NS ns1.google.com.

;; ADDITIONAL SECTION:
ns1.google.com. 206437 IN A 216.239.32.10
ns2.google.com. 210049 IN A 216.239.34.10
ns3.google.com. 210049 IN A 216.239.36.10
ns4.google.com. 210049 IN A 216.239.38.10
ns1.google.com. 210874 IN AAAA 2001:4860:4802:32::a
ns2.google.com. 341654 IN AAAA 2001:4860:4802:34::a
ns3.google.com. 57401 IN AAAA 2001:4860:4802:36::a
ns4.google.com. 304702 IN AAAA 2001:4860:4802:38::a

;; Query time: 35 msec
;; SERVER: 202.88.174.6#53(202.88.174.6)
;; WHEN: Thu Jul 26 22:29:29 IST 2018
;; MSG SIZE rcvd: 331

Finally, return it to the OS, then Browser

The resolver finally gives the IP address of the website back to the OS, which caches it for future requests. The OS then gives it to the browser, which sends the request to that IP address and serves the page. So if you enter in the browser the IP address of google.com that we got from the dig command (172.217.163.78), it will point to the same page.
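
You can trigger this whole chain from Python as well – a tiny illustration that asks the OS resolver, which in turn walks the steps described above:

import socket

# Behind this single call sit the browser/OS caches, the ISP's resolver,
# the root servers, the TLD servers and the authoritative nameservers
print(socket.gethostbyname('google.com'))   # e.g. 172.217.163.78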

The best part is that this entire process usually takes just a fraction of a second to complete!