Sunday, August 21, 2016

Assessing Performance of B-Schools: An Approach Using Data

The motivation to write this article came when I came across Section 197 of the Companies Act (India), and from wanting to learn data scraping using Python! :P For the complete technicalities you may read this article. In a nutshell, it specifies a limit on what a public company can pay its Board of Directors and managers. On top of that, this information has to be disclosed in the annual report.

Yes, a publicly listed company has to reveal what it is paying its top management. My next step was to look up the annual report of one of the companies considered most prestigious at B-schools. And I was not disappointed.

Check out HUL's Annexure to the annual report (2015-16).
You will realize that it is a rich (no pun intended) source of data (check out the remuneration, it is mind-boggling). And it got me wondering: can I use this to see which B-schools' alumni are really on a roll? If I build this across multiple companies, will I find a coterie of institutes at the top?

This led me to write a script which pulls names from such data and matches them with their education. The following is the result:

The data has been taken from top FMCG companies: HUL, ITC, Nestle, Marico, Castrol, Asian Paints and HCL Tech (the only non-FMCG company).


  • The results support common knowledge: IIMs and XLRI lead the pack.
  • But what is interesting to see is that IIM Calcutta and XLRI have a higher number of alumni at the top level as compared to IIM Ahmedabad. (Perhaps I should collect more data)
  • What is even more interesting is that Chartered Accountants rule the roost and are well represented in all the companies. All the CFOs were Chartered Accountants.
  • All the HR heads were from XLRI.
  • HUL employs the highest number of MBA graduates spread across all premier B-schools.
  • Even though the number of companies analyzed is small, it seems reasonable to assume that there is a cluster of institutes (IIM C, XLRI and IIM A) which has a strong presence in the FMCG sector.
  • It will be interesting to see if the newly established/emerging campuses (IIM K, IIM I, SPJIMR) will be able to change this pecking order in the future.


1) Many people had not put any information about their educational background online. This lack of data can account for some inaccuracy in the result.
2) Seven companies and an overall sample of 200 business graduates make for a very small data set.
3) It was observed that MBA graduates formed only 20-30% of the total set. These observations are within this subset of MBA graduates.
4) There is a chance that wrong entries might be picked up. But the effect of such errors should become negligible as the number of entries increases.

For example, an article by LiveMint showed that IIMA produced the highest number of CEOs in BSE 500 companies. It is slightly surprising since that trend is not clearly visible here.

Tech Corner:

This was the first time I worked extensively in Python to build a scraper, and I absolutely loved the Selenium driver. It makes it so much easier to build web bots.
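The matching-and-counting step of such a script can be sketched without the scraping part. This is only a minimal illustration, not the actual script: the alias table, the function names and the sample records below are all made up, and the sketch simply assumes that each scraped person ends up as a (name, education) pair in which the education may be missing.

```python
from collections import Counter

# Hypothetical alias table: maps spellings found online to one canonical name.
ALIASES = {
    "iim calcutta": "IIM Calcutta",
    "iim c": "IIM Calcutta",
    "indian institute of management calcutta": "IIM Calcutta",
    "xlri": "XLRI",
    "xlri jamshedpur": "XLRI",
    "iim ahmedabad": "IIM Ahmedabad",
    "iim a": "IIM Ahmedabad",
}


def canonical_institute(raw):
    """Return the canonical institute name, or None if unknown or missing."""
    if not raw:
        return None
    # Lowercase, drop commas and collapse repeated whitespace before lookup.
    key = " ".join(raw.lower().replace(",", " ").split())
    return ALIASES.get(key)


def tally_institutes(records):
    """Count people per institute; records are (name, education) pairs."""
    counts = Counter()
    for _name, education in records:
        institute = canonical_institute(education)
        if institute is not None:
            counts[institute] += 1
    return counts


# Placeholder records, not real people or real results.
sample = [
    ("Executive 1", "IIM C"),
    ("Executive 2", "XLRI, Jamshedpur"),
    ("Executive 3", None),  # no education information found online
    ("Executive 4", "iim calcutta"),
]
print(tally_institutes(sample).most_common())
```

People whose education cannot be found or recognized are simply skipped, which is exactly why missing profiles can skew the counts.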

If you want the actual script, do leave a comment and I will get back to you.  :)

Important Links 

1) Section 197 Companies Act
2) HUL Annexure report
3) Selenium-python docs

Sunday, March 13, 2016

DDoS Attacks and Mitigation, an Account from the Practical World

Before I delve into the anti-DDoS methodologies involved, let me explain what a DDoS is and how it can impact you, the customer.

Denial Of Service (DoS)

DoS stands for Denial of Service, which involves bad guys (known as hackers) sending so much garbage data to the customer's site that its performance starts getting affected. In other words, a hacker sends so many garbage requests that your product site just does not have enough resources to serve genuine users.
DDoS stands for Distributed DoS, which is basically a hacker conducting a DoS attack from multiple locations simultaneously, making it even more difficult to identify and block the offending traffic.

How does it impact you?

It has been reported that almost 72% (yes, almost three quarters) of servers serving an IT product get DDoS'ed! That means if your domain does not have a DDoS mitigation policy, your business is likely to be impacted. Not only do you lose money, but your brand reputation also gets affected.

A DDoS attack is when an evil user who hates to see you flourish decides to send a huge number of packets to your site. This could start using up the service provider's internet bandwidth or start using resources on the server hosting your domain.

Generally, tools such as Cacti (rrdtool) and NfSen are used to measure incoming and outgoing bandwidth and the nature of the traffic we receive (is it website based or DNS?). There are also tools which can be used to detect such attacks and take preventive action.
An example of a spike in traffic

False alarms can be difficult to identify

Server side monitoring

On the server side one can set up monitoring tools which measure crucial parameters like CPU usage, bandwidth usage, and the number of processes and threads running.
When an attacker sends a lot of junk data to your site, your site's network will suddenly see a spike in traffic and the bandwidth consumed increases.
Generally the NOC or SOC is quick to detect this increase via alerts or graphs.
In some complex attacks there might not be an increase in the bandwidth consumed, but the number of packets arriving per second increases dramatically.
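To make the difference between the two signals concrete, here is a minimal sketch of a spike check that can be run separately on bandwidth and on packets per second. It is not how any particular NOC tool works: the function name, the window size, the threshold factor and the readings are all assumptions chosen for illustration.

```python
from collections import deque


def detect_spikes(samples, window=5, factor=3.0):
    """Return indexes of samples that exceed `factor` times the mean of the
    previous `window` samples."""
    history = deque(maxlen=window)
    spikes = []
    for i, value in enumerate(samples):
        # Only compare once a full window of earlier readings is available.
        if len(history) == window:
            baseline = sum(history) / window
            if baseline > 0 and value > factor * baseline:
                spikes.append(i)
        history.append(value)
    return spikes


# Made-up per-second readings, for illustration only.
mbps = [100, 105, 98, 102, 101, 103, 99, 104]  # bandwidth stays flat
pps = [20_000, 21_000, 19_500, 20_500, 20_000, 20_200, 95_000, 97_000]

print("bandwidth spikes at:", detect_spikes(mbps))
print("packet-rate spikes at:", detect_spikes(pps))
```

With these made-up numbers the bandwidth series raises no alert while the packet-rate series does, which is the situation described above. Note that a simple rolling mean absorbs a sustained attack into its baseline after the first alert, so a real alerting setup would need something sturdier.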

CPU and other metrics of a server being measured by Graphite

After diagnosing the incident your hosting provider can employ a BGP announcement technique to mitigate the attack.
By changing the BGP announcement the hosting provider tells the whole internet that the best route to them is via a mitigation provider (Prolexic is a popular service).

Now the entire internet thinks that the way to reach your hosting is via the mitigation provider and starts sending all of your traffic to them.
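How the announcement is changed is not covered here, so the following is only a sketch under one assumption: that the mitigation provider announces a more specific prefix for the attacked addresses. Routers prefer the longest matching prefix, so that announcement wins for those addresses. The prefixes come from a range reserved for testing, and the next-hop labels and function name are hypothetical.

```python
import ipaddress


def best_route(destination, routes):
    """Pick the next hop whose prefix is the longest match for destination.

    `routes` is a list of (network, next_hop) pairs.
    """
    address = ipaddress.ip_address(destination)
    matching = [
        (network, next_hop)
        for network, next_hop in routes
        if address in network
    ]
    if not matching:
        return None
    _network, next_hop = max(matching, key=lambda item: item[0].prefixlen)
    return next_hop


# Normal operation: only the hosting provider announces its address block.
normal = [(ipaddress.ip_network("198.18.0.0/15"), "hosting-provider")]

# Assumed diversion: a more specific prefix is announced for the attacked range.
during_attack = normal + [
    (ipaddress.ip_network("198.18.5.0/24"), "mitigation-provider"),
]

print(best_route("198.18.5.10", normal))
print(best_route("198.18.5.10", during_attack))
print(best_route("198.18.9.10", during_attack))
```

In this sketch only the addresses inside the more specific prefix are diverted; the rest of the block keeps its usual path.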

Such mitigation centers have enough bandwidth and devices to analyze the traffic and apply suitable filters to allow only clean traffic to pass through, thus thwarting a DDoS attack.

For the technically savvy reader, note that even though incoming traffic comes via the mitigation provider, the outgoing traffic (traffic which leaves the server, towards a customer) goes via the normal ISP link.

Further analysis:

A DDoS attack is hugely effective when the illegitimate traffic starts to congest or choke the bandwidth. In such a scenario all the servers within that datacenter (a central place where many servers are packed) are affected. So even if your domain is not being attacked but someone else's domain is, your services still get impacted.
To reduce the chance of this happening, companies usually deploy multiple links with large bandwidth.
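A rough back-of-the-envelope model shows why extra link capacity helps. It assumes, purely for illustration, that a congested link drops legitimate and attack traffic in equal proportion; the function name and all the figures are made up.

```python
def delivered_fraction(capacity_gbps, legit_gbps, attack_gbps):
    """Fraction of offered traffic that fits through a shared link.

    Assumes a congested link drops all traffic in equal proportion.
    """
    offered = legit_gbps + attack_gbps
    if offered <= capacity_gbps:
        return 1.0
    return capacity_gbps / offered


# Made-up figures: 4 Gbps of genuine traffic for the whole datacenter and
# a 36 Gbps attack aimed at a single domain hosted there.
print(delivered_fraction(10, 4, 36))  # one 10 Gbps link
print(delivered_fraction(40, 4, 36))  # four such links combined
```

Under this model a single 10 Gbps link delivers only a quarter of everyone's traffic, including that of domains which were never targeted, while 40 Gbps of combined capacity carries all of it.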

Important Links:

1) Impact of DDoS
2) Wiki on DDoS
3) Prolexic Mitigation technique