author: niplav, created: 2019-07-13, modified: 2020-04-08, language: english, status: in progress, importance: 5, confidence: log

The blog Shtetl Optimized is written by Scott Aaronson and deals with quantum computing and theoretical computer science. However, its archives are clumsy to navigate chronologically. This page makes that easier.

Shtetl Optimized Posts Chronological Index

This index currently lists 784 posts from 2006-11-20 until 2019-12-28.

Archives

2005

2006

2007

2008

2009

2010

2011

2012

2013

2014

2015

2016

2017

2018

2019

2020

Code

The site was scraped using Python 2 with the libraries urllib2 and BeautifulSoup:

import urllib2
from bs4 import BeautifulSoup
import datetime

author='Scott Aaronson'

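# Walk the yearly archives from 2005 up to the current year.
for year in range(2005, datetime.datetime.now().year+1):
    yearposts=[]
    # Request up to 99 archive pages per year, stopping at the first HTTP error.
    for page in range(1, 100):
        url='https://www.scottaaronson.com/blog/?m={}1&paged={}'.format(year, page)
        # Send a custom User-Agent header with the request.
        req=urllib2.Request(url, headers={'User-Agent' : "Magic Browser"})
        try:
            con=urllib2.urlopen(req)
        except urllib2.HTTPError:
            # An HTTP error is treated as the end of this year's archive.
            break
        data=con.read()
        soup=BeautifulSoup(data, 'html.parser')
        # Each post is an element with class "post": title and link in its h3, date in small.
        posts=soup.find_all(class_="post")
        for p in posts:
            title=p.h3.text
            link=p.h3.a.get('href')
            date=p.small.text
            entry='* [{}]({}) ({}, {})'.format(title.encode('utf_8'), str(link), str(author), str(date))
            yearposts.append(entry)
    # Print a heading for the year, then the entries in reverse of the scraped order.
    print('\n### {}\n'.format(year))
    for t in reversed(yearposts):
        print(t)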
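
Python 2 is no longer maintained, so a rough Python 3 sketch of the same loop may be useful. It is not the code that generated this index. The function names (archive_url, format_entry, parse_posts, scrape_year, main) and the max_pages parameter are made up for illustration. The URL pattern, the User-Agent string and the "post" class are copied from the script above and have not been checked against the current site. urllib2 is replaced by urllib.request and urllib.error from the standard library; BeautifulSoup is still assumed to be installed.

```python
import datetime
import urllib.error
import urllib.request

AUTHOR = 'Scott Aaronson'


def archive_url(year, page):
    """Build the archive URL for one year and page (pattern copied from the Python 2 script)."""
    return 'https://www.scottaaronson.com/blog/?m={}1&paged={}'.format(year, page)


def format_entry(title, link, author, date):
    """Format one post as a Markdown list item, as in the Python 2 script."""
    return '* [{}]({}) ({}, {})'.format(title, link, author, date)


def parse_posts(html):
    """Return (title, link, date) tuples for every element with class "post"."""
    # Imported here so the helpers above also work without BeautifulSoup installed.
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')
    return [(p.h3.get_text(), p.h3.a.get('href'), p.small.get_text())
            for p in soup.find_all(class_='post')]


def scrape_year(year, max_pages=99):
    """Collect the formatted entries of one year, in the order the archive serves them."""
    entries = []
    for page in range(1, max_pages + 1):
        req = urllib.request.Request(archive_url(year, page),
                                     headers={'User-Agent': 'Magic Browser'})
        try:
            with urllib.request.urlopen(req) as con:
                html = con.read()
        except urllib.error.HTTPError:
            # As in the original, an HTTP error is treated as the end of the year's archive.
            break
        for title, link, date in parse_posts(html):
            entries.append(format_entry(title, link, AUTHOR, date))
    return entries


def main():
    """Print one heading per year followed by that year's entries in reversed order."""
    for year in range(2005, datetime.datetime.now().year + 1):
        print('\n### {}\n'.format(year))
        for entry in reversed(scrape_year(year)):
            print(entry)
```

Calling main() sends the same requests as the Python 2 version. Strings are already Unicode in Python 3, so the encode('utf_8') and str() calls are dropped. Parsing is kept apart from fetching so that parse_posts can be tried on a saved archive page without touching the network.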