Toolbending:There are secret messages everywhere

To present

Everyone prepares one file (fichier.mp3, fichier.wav, fichier.ps, fichier.png, fichier.txt, fichier.pdf ...) for the next session:

Try to vary the ingredients: texts, fonts, options, colors ...
Use "man", "-h" or search online to find options, examples, and help
Combine several recipes using <, | and >
Document your recipes here: http://ustensile.be/index.php/Recettes !

http://ustensile.be/index.php/Toolbending:Ligne_de_commande,_Commande_de_ligne#Exp.C3.A9rimentation

Workshop: There are secret messages everywhere

We look at internet content every day, with its diverse design styles, advertising and images. But if we look beyond the result that we see in the browser, we can see that it is a product of structured data. Let's do a few experiments to see if we can extract specific bits of information from web sources.

cURL is a command-line Swiss Army knife for grabbing internet data: http://curl.haxx.se/

Try downloading some URLs with cURL and examine the HTML. Also remember that you can redirect the output to a file with ">" or pipe it to another command-line program with "|" to do more things with it. Try using "grep" to find particular parts of the web page that are interesting.

$ curl http://apple.stackexchange.com/feeds/question/35852 | grep author
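The same kind of filtering can also be done in Python once the content has been saved. The sketch below is a rough stand-in for "grep", not a replacement for it: the function name grep_lines and the sample feed text are invented for illustration.

```python
import re


def grep_lines(text, pattern):
    """Return the lines of text that contain a match for pattern, like a minimal grep."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


# Hypothetical sample standing in for downloaded feed content.
sample = """<entry>
  <title>How do I rename a file?</title>
  <author><name>alice</name></author>
  <updated>2012-01-10T09:00:00Z</updated>
</entry>"""

for line in grep_lines(sample, "author"):
    print(line.strip())
```

Because the pattern is a regular expression, something like grep_lines(sample, "title|updated") would keep several kinds of lines at once.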

RSS and Atom feeds are special XML files that describe the recent updates to a web site. We can use them to make our own newspapers and curated collections. https://fr.wikipedia.org/wiki/RSS

We can write a short Python program that extracts common details from RSS feeds using the minidom library. http://docs.python.org/2/library/xml.dom.minidom.html
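As a minimal sketch of such a program, assuming an RSS 2.0 feed whose entries are "item" elements with "title" and "link" children: the sample feed and the helper names child_text and feed_items are made up for this example, and a real feed would first be downloaded with cURL.

```python
from xml.dom import minidom

# Hypothetical RSS 2.0 snippet; a real feed would be downloaded with curl first.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def child_text(node, tag):
    """Return the text of the first <tag> element inside node, or "" if absent."""
    elements = node.getElementsByTagName(tag)
    if not elements or elements[0].firstChild is None:
        return ""
    return elements[0].firstChild.data.strip()


def feed_items(xml_text):
    """Return a list of (title, link) pairs, one per <item> in the feed."""
    dom = minidom.parseString(xml_text)
    return [(child_text(item, "title"), child_text(item, "link"))
            for item in dom.getElementsByTagName("item")]


for title, link in feed_items(SAMPLE_RSS):
    print("%s %s" % (title, link))
```

Atom feeds use "entry" elements instead of "item", so the tag name passed to getElementsByTagName would need to change for those.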

Using the DOM can be rather confusing, and sometimes we need to "scrape" contents from an HTML page instead. The Python library Beautiful Soup makes this much easier. http://www.crummy.com/software/BeautifulSoup/

Installing BeautifulSoup on Linux

$ sudo apt-get install python-bs4

Save an IMDb page to a file

$ curl "http://www.imdb.com/title/tt1740707/?ref_=fn_al_tt_1" > movie.html

In a text editor, save the following as "movie.py":

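# Print the text of every element marked itemprop="name" in the saved page.
from bs4 import BeautifulSoup
# Open in binary mode so Beautiful Soup can work out the page encoding itself.
with open("movie.html", "rb") as doc:
    # Name the parser explicitly so the result does not depend on what is installed.
    soup = BeautifulSoup(doc, "html.parser")
actors = soup.find_all(itemprop="name")
for actor in actors:
    print(actor.text)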

In the terminal:

$ python movie.py
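To experiment without downloading a page, the same search can be tried on a small inline document. This is only a sketch: the HTML below, including the "cast" table class, is invented for the example and does not claim to match the markup of any real site. It shows that find_all(itemprop="name") returns every matching element, and how searching inside one part of the page narrows the result.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example; a real page will differ.
SAMPLE_HTML = """
<html><body>
  <h1 itemprop="name">Some Movie</h1>
  <table class="cast">
    <tr><td><span itemprop="name">First Actor</span></td></tr>
    <tr><td><span itemprop="name">Second Actor</span></td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Every element carrying itemprop="name", including the page heading.
all_names = [tag.get_text(strip=True) for tag in soup.find_all(itemprop="name")]

# Only the names inside the (hypothetical) cast table.
cast_table = soup.find("table", class_="cast")
cast_names = [tag.get_text(strip=True) for tag in cast_table.find_all(itemprop="name")]

print(all_names)
print(cast_names)
```

When adapting this to a downloaded page, it helps to open the saved HTML in a text editor first and look for the tags and attributes that surround the information you want.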

Brendan Howell was born in Manchester, CT, USA in 1976. He is an artist and a reluctant engineer who has created various software works and interactive electronic inventions. Currently, he lives in Berlin, Germany. He has done research and led courses at the Muthesius Kunsthochschule, Merz Akademie, Fachhochschule Potsdam and the Kunsthochschule Berlin, Weißensee.
