22 Feb

How to get favicon.ico files from Alexa Top 1000 sites in 2 minutes with Python

Make folders:

mkdir -p favicons/icons ; cd favicons

Get a list of Alexa Top 1000 sites:

curl -s -O http://s3.amazonaws.com/alexa-static/top-1m.csv.zip ; unzip -q -o top-1m.csv.zip top-1m.csv ; head -1000 top-1m.csv | cut -d, -f2 | cut -d/ -f1 > topsites.txt

This time-saving one-liner was found here.

Install gevent:

yum install python-gevent

Gevent is a high-performance network framework for Python built on top of libevent and greenlets.

A few modifications to an example shipped with gevent:

#!/usr/bin/python
# Copyright (c) 2009 Denis Bilenko. See LICENSE for details.
"""Spawn multiple workers and wait for them to complete"""

urls = ['http://www.' + line.strip() for line in open('topsites.txt')]

import gevent
from gevent import monkey

# patches stdlib (including socket and ssl modules) to cooperate with other greenlets
monkey.patch_all()

import urllib2
from socket import setdefaulttimeout
setdefaulttimeout(30)

def print_head(url):
    print 'Starting %s' % url
    url = url + '/favicon.ico'
    try:
        data = urllib2.urlopen(url).read()
    except Exception, e:
        print 'error', url, e
        return

    # strip the 'http://www.' prefix (11 chars) and flatten the rest
    # of the URL into a safe file name
    fn = 'icons/' + url[11:].replace("/", "-")
    myFile = file(fn, 'wb')
    myFile.write(data)
    myFile.close()

jobs = [gevent.spawn(print_head, url) for url in urls]

gevent.joinall(jobs)
[dande@host favicons]$ time python ./get.py
...

real 0m50.644s
user 0m1.914s
sys 0m0.888s
[dande@host favicons]$
[dande@host favicons]$ ls icons/ | wc -l
889
[dande@host favicons]$

Well, there’s not much sense in this except fooling around with Python.
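Incidentally, on Python 3 the same trick needs nothing beyond the standard library. Here is a rough sketch along the same lines (not the original script): `ThreadPoolExecutor` plays the role of the gevent spawn loop, and the file-name mangling mirrors the `url[11:].replace('/', '-')` trick above:

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def icon_filename(url):
    # 'http://www.example.com' -> 'icons/example.com-favicon.ico',
    # the same flattening as in the gevent script above
    return 'icons/' + url[len('http://www.'):].replace('/', '-') + '-favicon.ico'

def fetch(url):
    try:
        data = urlopen(url + '/favicon.ico', timeout=30).read()
    except OSError as e:
        print('error', url, e)
        return
    with open(icon_filename(url), 'wb') as f:
        f.write(data)

# to run it over the whole list:
# urls = ['http://www.' + line.strip() for line in open('topsites.txt')]
# with ThreadPoolExecutor(max_workers=50) as pool:
#     pool.map(fetch, urls)
```

Threads are heavier than greenlets, but for a thousand mostly-idle sockets the difference hardly matters.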

21 Feb

How to download Coursera materials with use of Python

Install coursera-dl by Dirk Gorissen:

python-pip install coursera-dl

Make a folder to store files:

mkdir -p ./courses/comnetworks-2012-001

Run:

coursera-dl -u [email protected] -p your_password ./courses/comnetworks-2012-001 comnetworks-2012-001

Enjoy.

If you want to check whether there are new materials, just run the same command again. coursera-dl is smart enough to skip files you already have:

- Downloading resources for 2-6 Link Layer Overview (0414)
- "2-readings.pdf" already exists, skipping
- "2-6-link-overview-ink.pdf" already exists, skipping
- "2 - 6 - 2-6 Link Layer Overview (0414).txt" already exists, skipping
- "2 - 6 - 2-6 Link Layer Overview (0414).srt" already exists, skipping
- "2 - 6 - 2-6 Link Layer Overview (0414).mp4" already exists, skipping
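The skip logic presumably boils down to an existence check before each download. A minimal sketch of the idea (`fetch` is a hypothetical callable returning the file’s bytes, not part of coursera-dl):

```python
import os

def maybe_download(fn, fetch):
    # mirror the behaviour above: never re-download an existing file
    if os.path.exists(fn):
        print('- "%s" already exists, skipping' % fn)
        return False
    with open(fn, 'wb') as f:
        f.write(fetch())
    return True
```

This also means a half-finished download from a crashed run will be skipped, so delete suspicious files before re-running.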

20 Feb

Watching specified files/folders for changes in Python

For specific purposes there could be a need to monitor file and folder changes on a Linux box. To achieve this you can go with incrond. There is also a Pythonic way: several Python wrappers around the inotify feature are available. Here we’ll cover the simple Python daemon Watcher (github repo). First of all, we need to install the python-inotify package:

yum install python-inotify.noarch

python-inotify uses a Linux kernel feature called inotify (available since version 2.6.13). It allows user-space applications to receive notifications about file system events.
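The standard library doesn’t expose inotify, but the kind of events Watcher reacts to can be approximated with plain polling. A stdlib-only sketch of the create/delete detection idea — not how inotify actually works:

```python
import os
import time

def watch(path, interval=1.0, cycles=3):
    """Poll `path` and collect ('create'/'delete', name) events, one pass per cycle."""
    seen = set(os.listdir(path))
    events = []
    for _ in range(cycles):
        time.sleep(interval)
        now = set(os.listdir(path))
        events += [('create', name) for name in sorted(now - seen)]
        events += [('delete', name) for name in sorted(seen - now)]
        seen = now
    return events
```

Unlike inotify, polling wakes up whether or not anything changed and can miss short-lived files — exactly why the kernel-side feature exists.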

Now you can download the latest version of the config and the daemon:

mkdir watcher
cd watcher
wget https://raw.github.com/splitbrain/Watcher/master/watcher.ini
wget https://raw.github.com/splitbrain/Watcher/master/watcher.py

Modify your watcher.ini to meet your requirements:

[DEFAULT]
logfile=/tmp/watcher.log
pidfile=/tmp/watcher.pid
[job1]
watch=/tmp
events=create,delete
recursive=false
autoadd=true
command=ls -l $filename

Now you are ready to start the Watcher daemon:

chmod u+x watcher.py
./watcher.py -c watcher.ini debug

19 Feb

10 Minutes Celery Introduction

Celery is an asynchronous task queue/job queue based on distributed message passing. This post is not a detailed introduction but rather a short how-to on getting started with Celery.

Using Celery involves several components:

  • a broker. Think of it as a transport; you can choose among RabbitMQ, Redis or SQL servers;
  • a worker application which executes tasks;
  • a client application which adds tasks to the queue.
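To make the broker/worker/client split concrete before installing anything, here is a toy in-process model using only the standard library; a `queue.Queue` stands in for the broker and a dict for the result backend — nothing here is Celery’s actual machinery:

```python
import queue
import threading

broker = queue.Queue()   # stands in for Redis/RabbitMQ
results = {}             # stands in for the result backend

def worker():
    # the worker application: pull tasks off the broker and execute them
    while True:
        task_id, func, args = broker.get()
        results[task_id] = func(*args)
        broker.task_done()

threading.Thread(target=worker, daemon=True).start()

# the client application: enqueue a task, then wait for the result
broker.put(('job-1', lambda x, y: x + y, (4, 4)))
broker.join()
print(results['job-1'])  # prints 8
```

The real thing differs in the important ways — the broker survives restarts and the worker runs on another machine — but the data flow is the same.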

Let’s get started. First of all we need to install Celery. I run a Fedora server; if you use Debian, use apt-get.

yum install python-celery.noarch

For the sake of simplicity we’ll use Redis as a broker. It’s fast, simple to set up and doesn’t consume a lot of resources.

yum install redis

Now we can tune some options. Here’s redis.conf example:

daemonize no
pidfile /var/run/redis/redis.pid
port 6379
bind 127.0.0.1
timeout 0
loglevel notice
logfile /var/log/redis/redis.log
databases 16
save 900 1
save 300 10
save 60 10000
rdbcompression yes
dbfilename dump.rdb
dir /var/lib/redis/
slave-serve-stale-data yes
appendonly no
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
slowlog-log-slower-than 10000
slowlog-max-len 128
vm-enabled no
vm-swap-file /tmp/redis.swap
vm-max-memory 0
vm-page-size 32
vm-pages 134217728
vm-max-threads 4
hash-max-zipmap-entries 512
hash-max-zipmap-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes

We will also need the celery-with-redis package, which Celery requires to work with Redis:

python-pip install -U celery-with-redis

Keep in mind that this command will also update your current Celery installation along with its dependencies. It’s not a big deal, but you should be aware of it.

Now let’s create our worker application called tasks.py:

from celery import Celery
celery = Celery('tasks', broker='redis://localhost:6379/0', backend='redis://localhost:6379/1')

@celery.task
def add(x, y):
    return x + y

Now we can launch it:

celery -A tasks worker --loglevel=info

You should get output similar to this:

 
-------------- celery@turtle v3.0.15 (Chiastic Slide)
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: redis://localhost:6379/0
- ** ---------- . app: tasks:0x1d1b690
- ** ---------- . concurrency: 1 (processes)
- ** ---------- . events: OFF (enable -E to monitor this worker)
- ** ----------
- *** --- * --- [Queues]
-- ******* ---- . celery: exchange:celery(direct) binding:celery
--- ***** -----

[Tasks]
. tasks.add

[2013-02-19 23:52:42,339: WARNING/MainProcess] celery@turtle ready.
[2013-02-19 23:52:42,361: INFO/MainProcess] consumer: Connected to redis://localhost:6379/0.

Here’s our client application:

from tasks import add
result = add.delay(4, 4)
print result.get(timeout=1)

Note that here we use Celery in synchronous mode: we wait until the result is ready. I believe in most cases one would use Celery in asynchronous mode; here we do it synchronously just to get a result and make sure everything works.

Output:

[dande@turtle ~]# python client.py
8
[dande@turtle ~]#

Now that everything is ready, we can start thinking about what to do with the described solution.

By the way, if you are interested in how Celery uses Redis run:

redis-cli monitor

07 Feb

Tag Clouds in Python

One of the most beautiful things about Python is its abundance of third-party libraries (which the Lua world unfortunately lacks). Creating a tag cloud in Python is quite easy. First, we need to install the required packages. I run Fedora 18.

yum install python-pip.noarch
yum install pygame
yum install simplejson

Now you can install pytagcloud. (NB: do not use easy_install, pip is the right way to go).

python-pip install -U pytagcloud

Now we are ready to create our first tag cloud image:

from pytagcloud import create_tag_image, make_tags
from pytagcloud.lang.counter import get_tag_counts

TEXT = '''
You know the day destroys the night
Night divides the day
Tried to run
Tried to hide
Break on through to the other side
Break on through to the other side
Break on through to the other side, yeah

We chased our pleasures here
Dug our treasures there
But can you still recall
The time we cried
Break on through to the other side
Break on through to the other side
Yeah!
C'mon, yeah
Everybody loves my baby
Everybody loves my baby
She get
She get
She get
She get high
I found an island in your arms
Country in your eyes
Arms that chain us
Eyes that lie
Break on through to the other side
Break on through to the other side
Break on through, oww!
Oh, yeah!
Made the scene
Week to week
Day to day
Hour to hour
The gate is straight
Deep and wide
Break on through to the other side
Break on through to the other side
Break on through
Break on through
Break on through
Break on through
Yeah, yeah, yeah, yeah
Yeah, yeah, yeah, yeah, yeah'''

tags = make_tags(get_tag_counts(TEXT), maxsize=150)

create_tag_image(tags, 'cloud_large.png', size=(900, 600))

After you run this script you should have the file ‘cloud_large.png’ in your current directory. Here is mine:

[tag cloud image: cloud_large.png]
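Under the hood, `get_tag_counts` is essentially a word-frequency count. `collections.Counter` does the same job; the tokenization below (lowercase word runs) is a guess, not pytagcloud’s exact rules:

```python
import re
from collections import Counter

def tag_counts(text, top=5):
    # split into lowercase word tokens and count occurrences
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top)

print(tag_counts("Break on through to the other side, break on through"))
```

`make_tags` then just maps those counts onto font sizes between some minimum and `maxsize`.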

Python is great.

07 Feb

trac deployment with gunicorn and systemd in Fedora 17

gunicorn installation:

yum install python-gunicorn.noarch

Put systemd unit file to /lib/systemd/system/gunicorn-trac.service:

[Unit]
Description=gunicorn-trac

[Service]
ExecStart=/usr/bin/gunicorn -D -n gunicorn-trac -w5 tracwsgi:application -b 127.0.0.1:8000 --access-logfile /home/trac/log/access.log --error-logfile /home/trac/log/error.log
Type=forking
User=trac
Group=trac
Restart=always
StandardOutput=syslog
StandardError=syslog
WorkingDirectory = /home/trac/

[Install]
WantedBy=multi-user.target

Enabling, starting:

systemctl enable gunicorn-trac
systemctl start gunicorn-trac

Checking:

[root@moonstation ~]# netstat -lpn | grep gun
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN 1034/gunicorn: maste
[root@moonstation ~]#

Everything seems fine. Now we can proceed with the nginx setup as a frontend to trac.

01 Feb

trac deployment under nginx on Centos 6

Here’s an example of how to deploy trac under nginx. I assume that you already have trac installed.

server {
    listen 192.168.1.1:80;
    server_name trac.example.com www.trac.example.com default;

    location /chrome/common/ {
         alias /usr/lib/python2.6/site-packages/trac/htdocs/;
         expires 1M;
         add_header Cache-Control private;
         gzip_static on;
         gzip_disable "Firefox/([0-2]\.|3\.0)";
         gzip_disable "Chrome/2";
         gzip_disable "Safari";
    }
    location / {
        auth_basic            "Authorized area";
        auth_basic_user_file  /home/trac/.passwords;

        proxy_pass  http://127.0.0.1:8000;
        proxy_set_header REMOTE_USER $remote_user;
    }
}

tracd is launched this way:

/usr/bin/python /usr/sbin/tracd --daemonize --pidfile=/tmp/tracd.pid --port=8000 --protocol=http --single-env /home/trac -b 127.0.0.1 --basic-auth=/home/trac,/home/trac/.passwords,example.com

To get the authorization working you should also have this parameter in your trac.ini file:

obey_remote_user_header = true

01 Feb

How to modify a variable inside of a function

Code example

class z():
    def __init__(self):
        self.z = ['foo']
        print 'before', self.z

    def zoo(self, doo):
        doo[0] = 'ya'

class b():
    def __init__(self):
        self.z = 'foo'
        print 'before', self.z

    def zoo(self, doo):
        doo = 'ya'

A = z()
A.zoo(A.z)
print 'after', A.z

print

B = b()
B.zoo(B.z)
print 'after', B.z

Output

[dandelion@bart ~]$ python z.py
before ['foo']
after ['ya']

before foo
after foo
[dandelion@bart ~]$

Explanation

It’s simple. In Python a ‘string’ is an immutable object, while a [list] is a mutable one. The list is passed by reference and mutated in place; the string parameter is merely rebound inside the function, which the caller never sees.
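So if you need to “modify” an immutable argument, the usual idiom is to return the new object and rebind the name at the call site — a small sketch of the idea:

```python
def shout(s):
    s = s + '!'   # rebinds the local name; the caller's string is untouched
    return s      # hand the new object back instead

word = 'foo'
shout(word)
print(word)       # still 'foo'
word = shout(word)
print(word)       # now 'foo!'
```

The same applies to tuples, numbers and frozensets: you never mutate them, you replace them.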