Link Extractor in Python

11-09-2010, 02:35 PM
This is my first Python code. Feedback on improving it would be great.

# A small link extractor program.
import os,sys,urllib,re,httplib
if len(sys.argv) != 2:
    print "\n|-----------------------------------------------------------------|"
    print "| lastman100[@]gmail[dot]com |"
    print "| 10/2010 Link Extractor v0.1 |"
    print "| Visit : www.garage4hackers.com |"
    print "|-----------------------------------------------------------------|\n"

ab=raw_input("enter URL to extract the link\n")
if ht.search(ab):



y = link.finditer(st)

for i in y:
    print i.group()

11-09-2010, 03:00 PM
Nice start, bro. We are also planning to set up SVN, where we can host all the tools and keep track of changes and updates.

11-09-2010, 03:34 PM
Thanks for the encouragement, bro. I am planning to improve it further and add new features as my knowledge of this awesome language improves :)

11-09-2010, 06:38 PM
Awesome start, bro! Good to see. I'm using this tool :)

11-09-2010, 07:02 PM
Impressive, and I am encouraged to see such stuff. I believe we have enough tools to populate our new tools section.

11-10-2010, 12:10 AM
Thanks for the encouragement, prashant and Anarki bro. It means a lot to me :)

11-10-2010, 09:50 AM
Working great! :)


11-10-2010, 01:09 PM
Awesome, bro!

11-10-2010, 06:30 PM
Yeah... I see now that Darkest is really encouraged and enlightened after a one-night stay with FB1. Hope to see you rocking this stream, bro. Keep it up.

And we shall seriously start keeping proper track of GARAGE developments.

11-12-2010, 12:34 PM
Well, the code is fine; just a few comments as a programmer.
NOTE: These comments are about good programming practice. If you just want to write a quick and dirty script, do not read ahead. :D

Use re / re.compile as little as possible, since it will create optimization problems as your code grows.
So the first if:

if ht.search(ab):

Could be replaced with

if ab.startswith("http://"):
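
A minimal sketch of that prefix check, written for Python 3 rather than the Python 2 used in the post. The helper name has_http_scheme and the extra https:// prefix are assumptions added for illustration; they are not part of the posted code.

```python
def has_http_scheme(url):
    """Return True if url begins with http:// or https:// (case-insensitive)."""
    # str.startswith accepts a tuple of prefixes, so no regex is needed.
    # Accepting https:// as well is an assumption; the post only checks http://.
    return url.lower().startswith(("http://", "https://"))


if __name__ == "__main__":
    for candidate in ("http://www.garage4hackers.com", "www.garage4hackers.com"):
        print(candidate, has_http_scheme(candidate))
```

A plain string method like this keeps the intent obvious and avoids compiling a pattern for a fixed prefix.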

Furthermore, the regular expression is not that perfect.
Give it a test case such as

st = "www.com www..com www,yahoo,com mail?yahoo?com"

It will recognise all of these as proper URLs.

The regular expression needs to be improved.
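
One possible direction, sketched in Python 3. The pattern posted in the thread is not shown here, so HOST_RE and looks_like_url below are hypothetical names and the pattern is only an assumption about what "stricter" could mean: labels made of letters, digits and hyphens, separated by single literal dots, ending in an alphabetic top-level domain.

```python
import re

# Hypothetical stricter pattern (an assumption, not the pattern from the post).
HOST_RE = re.compile(
    r"(?:https?://)?"                                   # optional scheme
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+"  # labels, each followed by one literal dot
    r"[A-Za-z]{2,}"                                     # alphabetic top-level domain
    r"(?:/\S*)?"                                        # optional path
)


def looks_like_url(token):
    """Return True only if the whole token matches HOST_RE."""
    return HOST_RE.fullmatch(token) is not None


# The test case from the post, checked token by token.
st = "www.com www..com www,yahoo,com mail?yahoo?com"
accepted = [token for token in st.split() if looks_like_url(token)]
print(accepted)
```

With this sketch only www.com passes, since it is syntactically well formed; the tokens with a doubled dot, commas, or question marks are rejected because the dot is escaped and must sit between two non-empty labels.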

11-12-2010, 04:27 PM
Thanks for the feedback, neo.

I will try to incorporate these ideas in my next code.

11-13-2010, 01:02 AM
@the-empty It's all his work; he has been working on Python and I never helped him in any way. :) Good job, darkest, but I want to see the full code. I remember how bond used to scold me for being so impatient (which I still am :P). Now you are all the same here; just put up the rest of your ideas and make it huge :D