Validating BGP Announcements by Automating Filter Generation with Python: Part 2

In the first part of this post, I wrote that some tools use a “whois” client to query RIR databases. So you can use a whois command-line tool to gather the data you are interested in.

Whois

You can find the whois flags for RIPE at http://www.ripe.net/data-tools/support/documentation/queries-ref-card to refine your whois queries. Let's pull the routes of AS34984 from the Linux command line:

root@test:~# whois -h whois.ripe.net -i or as34984 | grep route:
route:          151.250.0.0/16
…output omitted

To get the route data into a variable you can work with, you still need to parse the relevant part of the output. A better approach, which I generally prefer, is to use a native library, so that your script does not depend on an external tool and therefore runs on Windows as well as on Linux/Mac. For a native client library, check https://pypi.python.org/pypi/WhoisClient
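As a rough illustration of the parsing step mentioned above, here is a minimal sketch that pulls the prefixes out of whois text output with a regular expression. It is written for Python 3, and the names ROUTE_LINE, extract_routes and sample_output are made up for this example. The sample text is hand-written: only its first route line comes from the command output shown earlier, the rest is placeholder data.

```python
import re

# Matches lines such as "route:          151.250.0.0/16" in whois text output.
# "route6:" lines are not matched, because the colon must directly follow "route".
ROUTE_LINE = re.compile(r"^route:\s*(\S+)", re.MULTILINE)


def extract_routes(whois_text):
    """Return the prefixes found on 'route:' lines, in order of appearance."""
    return ROUTE_LINE.findall(whois_text)


# Hand-written sample (assumption): only the first route line is taken from
# the whois output above, the other lines are placeholders.
sample_output = """\
route:          151.250.0.0/16
descr:          example description
origin:         AS34984
route:          192.0.2.0/24
origin:         AS34984
"""

print(extract_routes(sample_output))
```

In a real script the text would come from the whois tool or from a client library instead of a hard-coded string.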

Restful Web Services API

Even with a native library, you still need to parse the output data. A better way is to get the data in a structured format such as XML, YAML or JSON, so that it is easier to extract the relevant parts systematically. Fortunately, RIPE and ARIN each offer a RESTful Web Services API, a REST interface to their Whois database that is invoked via HTTP requests. Here are the documentation links:

RIPE: https://labs.ripe.net/ripe-database/database-api/api-documentation https://labs.ripe.net/ripe-database/database-api/api-introduction

For ARIN documentation check:

ARIN: https://www.arin.net/resources/whoisrws/whois_api.html#whoisrws https://www.nanog.org/meetings/nanog48/presentations/Wednesday/Kosters_Update_N48.pdf

Querying RIPE Web API and Processing XML Input With Python

The documentation shows that for AS5400, which is British Telecom, the URL http://rest.db.ripe.net/search.xml?query-string=as5400&inverse-attribute=origin can be used to get all routes that have an origin of AS5400. The only thing you need to change to get the prefixes of another AS is the “as5400” value in the URL itself. If you open the URL in your web browser, you will see the XML output.

So if we can open this URL in Python, read the output, and parse it as XML, we have what we need. To do so, let's examine the code I wrote, part by part:

Examining the code:

In the first part of the code we import the libraries we need: urllib, urllib2 and xml.etree.ElementTree (you may also use lxml instead of xml.etree.ElementTree).

import urllib,urllib2
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

We have determined how to build a URL that pulls the prefixes originated by an AS. It makes sense to keep the AS in a variable, so that later you can change the AS number or take the value as input.

as_to_pull_prefixes = "AS5400"
url = "http://rest.db.ripe.net/search.xml?query-string=%s&inverse-attribute=origin" % as_to_pull_prefixes
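If the AS number comes from user input, it may be worth validating it before it ends up in the URL. The following is a minimal sketch for Python 3 (the snippets in this post target Python 2). The function name build_origin_url and the accepted input forms are assumptions made for this example, not part of the original script.

```python
import re
from urllib.parse import urlencode

BASE_URL = "http://rest.db.ripe.net/search.xml"


def build_origin_url(asn):
    """Build the inverse 'origin' lookup URL for an AS.

    Accepts 'AS5400', 'as5400' or a bare number such as 5400.
    Raises ValueError for anything else.
    """
    text = str(asn).strip().upper()
    if text.isdigit():
        text = "AS" + text
    if not re.fullmatch(r"AS\d+", text):
        raise ValueError("not an AS number: %r" % (asn,))
    # urlencode takes care of escaping and joins the parameters with '&'.
    query = urlencode({"query-string": text, "inverse-attribute": "origin"})
    return BASE_URL + "?" + query


print(build_origin_url("as5400"))
```

Rejecting unexpected input early keeps stray characters out of the query string.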

Now that the URL is ready, we need to open it and assign the output to a string variable:

fp = urllib2.urlopen(url)
response = fp.read()

Then we need to parse the XML from the string into an element:

tree = ET.fromstring(response)
fp.close()
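The fetch-and-parse steps above assume that the request succeeds and that the response is valid XML. Here is a hedged sketch of the same two steps for Python 3 with a timeout and basic error handling. The function name fetch_tree, the timeout value, and the opener parameter (which only exists so the function can be tried without network access) are assumptions made for this example.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET


def fetch_tree(url, timeout=10, opener=urllib.request.urlopen):
    """Download url and return the parsed XML root element, or None on failure."""
    try:
        # The with-block closes the connection even if read() raises.
        with opener(url, timeout=timeout) as fp:
            response = fp.read()
    except (urllib.error.URLError, OSError) as exc:
        # Covers DNS failures, refused connections, HTTP errors and timeouts.
        print("request failed: %s" % exc)
        return None
    try:
        return ET.fromstring(response)
    except ET.ParseError as exc:
        print("response is not well-formed XML: %s" % exc)
        return None
```

A call such as fetch_tree(url) would then take the place of the urlopen, read, fromstring and close calls.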

The last part is to get the prefix values from the XML output, which we turned into an element using xml.etree.ElementTree. First we need to determine the hierarchy so that we can write a path argument. Looking at the XML output, the hierarchy is:

inside <objects>
inside <object> that has a type value of "route" (for ipv4 prefixes)
inside <primary-key>
every <attribute> which has a name value of "route"

To write such a path argument, you may want to have a look at https://docs.python.org/2/library/xml.etree.elementtree.html#elementtree-xpath. Here is the code:

interested = tree.findall("./objects/object[@type='route']/primary-key/attribute[@name='route']")

Our job is not done yet, because findall returns a list of matching elements, stored in the "interested" variable, and we still need to read the "value" attribute of each one. Here is the last piece of the code:

for child in interested:
    print child.get('value')
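To try the path expression without querying RIPE, the sketch below runs it against a small hand-written XML sample that follows the hierarchy described above. It is written for Python 3. The sample data, the AS number in it, and the helper name extract_prefixes are assumptions made for this example, and a real response carries many more elements. The helper also accepts "route6", assuming IPv6 route6 objects follow the same layout with a "route6" attribute in the primary key.

```python
import xml.etree.ElementTree as ET

# Hand-written sample (assumption) using documentation prefixes and a
# documentation AS number; it only mimics the hierarchy described above.
SAMPLE = """\
<whois-resources>
  <objects>
    <object type="route">
      <primary-key>
        <attribute name="route" value="192.0.2.0/24"/>
        <attribute name="origin" value="AS64496"/>
      </primary-key>
    </object>
    <object type="route6">
      <primary-key>
        <attribute name="route6" value="2001:db8::/32"/>
        <attribute name="origin" value="AS64496"/>
      </primary-key>
    </object>
    <object type="route">
      <primary-key>
        <attribute name="route" value="198.51.100.0/24"/>
        <attribute name="origin" value="AS64496"/>
      </primary-key>
    </object>
  </objects>
</whois-resources>
"""


def extract_prefixes(root, object_type="route"):
    """Return the primary-key prefix of every object of the given type."""
    path = "./objects/object[@type='%s']/primary-key/attribute[@name='%s']" % (
        object_type, object_type)
    return [attr.get("value") for attr in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(extract_prefixes(root))            # prefixes of the IPv4 route objects
print(extract_prefixes(root, "route6"))  # prefixes of the IPv6 route6 objects
```

The [@name='route'] filter matters because the primary key of a route object also contains the origin attribute, which would otherwise end up in the result.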

Trying It Out

Here is the output when I executed the script:

To download the script, click here, or copy-paste the full code below:

import urllib,urllib2
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET
###
###Variables which change per request
as_to_pull_prefixes = "AS5400"
url = "http://rest.db.ripe.net/search.xml?query-string=%s&inverse-attribute=origin" % as_to_pull_prefixes
###
###Pull info from IRR (RIPE) and assign it to a variable as a string
fp = urllib2.urlopen(url)
response = fp.read()
###
###Parse XML from string into an element
tree = ET.fromstring(response)
fp.close()
###
###Get the data of interest from the element
interested = tree.findall("./objects/object[@type='route']/primary-key/attribute[@name='route']")
for child in interested:
    print child.get('value')
###############
##For more info on how to select the data of interest from XML, see
##https://docs.python.org/2/library/xml.etree.elementtree.html#elementtree-xpath
###############


Monday, January 26, 2015