mek (Michael Karpeles) shares his work on openlibrary.org with the Internet Archive (archive.org), how Aaron Swartz influenced him, and tips on productivity (mek.fyi). If you're a developer or lover of libraries, volunteers are welcome.
mek: 00:00 It bothers me so much that 71% of the world is living on less than $10 a day. And these people don't have equal opportunity to access academic papers or books. Imagine all of the areas in rural China or in India or in Africa where they don't have a library system, and something like Open Library is their only chance of getting competitive access to high class books.
Gavin: 00:27 Welcome to the What Origin? podcast. Join the conversation at facebook.com/whatorigin, and find out more at whatorigin.com. Today, we are interviewing Michael Karpeles, who goes by MEK and works on Open Library, the world’s library, which is part of Internet Archive. MEK keeps notes on his life at mek.fyi, so feel free to check that out to find out more about him. I wanted to mention that one of MEK’s big influences was Aaron Swartz, who unfortunately took his life. So Episode 3 talks about mental health and depression, so if you are going through a rough time, feel free to take a listen. And at the least, talk to your friends and reach out before you do anything drastic.
Gavin: 01:16 So let’s learn about Open Library and Internet Archive, with MEK. And without further ado, let’s start the interview.
Gavin: 01:24 How can the audience help get involved or what opportunities exist at Archive and Open Library?
mek: 01:30 Yeah. Hey Gavin. Thanks so much for having me with you. The Internet Archive only has around 40 engineers, and we span a bunch of different projects, from recording television news; we have over a million hours of television news that's been recorded and uploaded to the archive. We have close to 4 million books that you could read or borrow directly from your browser. We also have things like academic papers, and one of our biggest challenges is we span so many different media types that it's hard to get a critical mass of engineers on our various projects. So for anyone who has used Open Library before, or openlibrary.org, we are an online digital library with around 4 million books that are available internationally for people to read or borrow. And I think a lot of people would be surprised to learn that the entire platform, which serves around 2 million users internationally, 2 million patrons, is ostensibly run by me, a group of volunteers, and one of my teammates, Charles Horn.
mek: 02:33 So the thing that we really need are people who are passionate about library systems, who want to play a role in running the world's greatest open source library and really developing a library which can serve the people, and work for the people. So we have around 45 volunteers right now, who kind of work with us on our Slack channels and join our weekly community calls which are 11:30 AM PT. And if you're interested in design or library metadata and cataloging, or you just want to improve the experience and fix bugs for other users, or have a say in what your library looks like or works like, then we would love to have you with us.
Gavin: 03:14 So where would you send folks? Is there an email or a contact form that you would tell folks to go to if they are interested?
mek: 03:22 I think a great first place to start is our GitHub. That's github.com/internetarchive/openlibrary. And if you do a Google search for Open Library GitHub, or if you go to the openlibrary.org website and you scroll to the footer, you should see an entry point to get to our GitHub. We try to make our entire culture open both in terms of open source and then also in terms of our meetings being public, our wiki and our documentation being public, our issues being public. So starting off with the GitHub and then after you've contributed a little bit and have the project built in Docker, then we usually invite people to Slack and we have a more vibrant, intimate community on our Slack channel.
Gavin: 04:07 And one specific question about Open Library is I notice some things are borrow, some things are just view, at the moment. So how do you get past the legality of sharing a book?
mek: 04:17 So the way that Open Library works is ever so slightly different from your traditional brick and mortar library, only because we're a digital library and the thing that people are reading or the thing that people are borrowing are digital copies of physical books that we have at our warehouse. So effectively, the Open Library and archive.org are behaving as a California State accredited library. And it just so happens that our model is slightly different. But every time you go to the website, you're either reading a book which is in the public domain, or is unrestricted due to policy affordances, or you're reading a title from our lending program. And when we lend out a copy of a book, we're actually doing a one-to-one lend against a physical copy of the same book that we've digitized that lives in one of our warehouses. And so if you imagine, if we have over a million lendable books on Open Library, then we have a million books that we've either purchased in order to support authors, or that have been donated to us, that rest safely in our warehouse that we lend against.
Gavin: 05:27 By digitized, do you mean you have a dedicated group that has scanned books?
mek: 05:32 That's right. So the Internet Archive is a nonprofit, and we work a little bit differently than a lot of other physical libraries because we employ with full benefits a team of scanning operators who literally flip through the pages of a book and digitize each page with high quality cameras. And we try to hit different levels of FADGI compliance for the quality of the scans, and then these end up on archive.org for our readers to enjoy.
Gavin: 06:04 How did you get involved with Archive? And were there people that influenced you to go towards sort of like a “social good” project?
mek: 06:12 I think I've always been drawn to the idea of a public good mission. I'm really lucky in that I was raised with pretty much every privilege at my disposal. I happen to be a white middle class male, you know, currently living in San Francisco. And I feel like a lot of people who are trying to figure out what to do with their life aren't really considering, like, “what does my privilege afford me to do?” There are a lot of people who are working multiple jobs, and they don't have a security blanket. There's nowhere they can really fall. And for me, I felt like there was a lot I could afford to lose and still be in really good shape. And that seemed like something really important to use to my advantage.
mek: 06:56 So I guess my story started in grad school, at the University of Delaware, where I was studying things that are kind of similar to Google Search. It was called Natural Language Processing or Computational Linguistics. And I had a friend from undergrad named Stephen Balaban, who Gavin, I believe you know as well, who had convinced me to start a company with him. And I left grad school and I spent a few years in this mode of just trying to be an entrepreneur and figuring out what it was all about. And at the time I had a very specific mission, which was curating a living map of the world's knowledge. I didn't know exactly what I wanted to do, but I figured if I could find ways to help people access information more quickly and more effectively, then wouldn't that make everyone better?
mek: 07:43 And what I started to realize as I went from one startup to the next startup was that the people who I was really helping were privileged people who were in a very similar situation to me. It's not like they were struggling paycheck to paycheck, or trying to figure out how they eat, or ... I mean homelessness is a huge problem in San Francisco. It's not like any of these people were worrying about where they were going to sleep that night. And there was one figure in particular who happened to build a lot of the technologies that I used when I was in startup land, named Aaron Swartz. And Aaron had built a technology called web.py, which, for the more technical folks, is a Python web framework that allowed you to create websites really quickly, to power your startup ideas.
mek: 08:30 And I really admired Aaron for a bunch of reasons. Not only was he a great technologist, but the more I became involved in this web.py community, I started to learn that Aaron was an activist for net neutrality. That is this idea that everyone should have the right to a voice and a censorship-free internet. He also advocated for access to knowledge in general and for a rigorous scientific process that I really applauded. And a lot of people don't know that Aaron actually had worked with and at the Internet Archive. And around 2010, I had discovered this project called Open Library that Aaron Swartz had actually started. So he was the founder of that project and joined forces with this guy, Brewster Kahle, who is the founder and digital librarian at the Internet Archive.
mek: 09:24 And so they kind of collaborated on this project. And Aaron's original idea for the project was, maybe Open Library could be kind of like Wikipedia, but a catalog or a webpage for every book in the world. If there's a book out there, it doesn't matter if it was published in 1920 or in the year 2000, there should be this wiki page on Open Library. And I think the genius thing that happened was that by aligning and partnering with the Internet Archive, a lot of the books that Aaron was cataloging with the community suddenly became readable or borrowable, because the Internet Archive had a copy, and they had this special nonprofit status.
mek: 10:03 So I remember being on a phone call with Aaron and saying, "Man, Aaron, I really love your Open Library project. But why isn't there anything like that for academic papers? Why is it that there's no catalog for searching for an academic paper and finding at least an abstract." I think since then we've come a long way. There's arXiv with an x, that's Cornell's repository of papers. There's also CORE and BASE and Directory of Open Access Journals. A lot of projects have come up. But at the time there really wasn't a community effort to create something like Open Library for papers, and to make academic papers more accessible internationally, especially to the communities who don't have access to it.
mek: 10:49 And at the time, I did not know that Aaron was currently facing an indictment for attempting to access papers through JSTOR, in a way that maybe JSTOR or MIT didn't approve of. And what he said is, "Hey, I don't really want to talk about this." And at the time I felt kind of hurt. Almost to my shame, I started working on this side project called Open Journal. And Open Journal was an attempt to be kind of like Open Library, but for academic papers. And shortly thereafter, it hit the news that Aaron was facing all of these complications, and unfortunately the weight of the whole situation became so heavy that he ended up taking his life. And I feel like that really left a big gap in my life. He was such a huge champion of my values. And I started to wonder ... at the time I was running a startup, which wasn't very satisfying, but it was paying the bills. And I started to wonder, "Well, how can I run this startup, but on the side maybe volunteer somewhere and continue some of Aaron's projects?"
mek: 11:53 And throughout a lot of 2015, basically the whole year, I worked for free just volunteering with the Internet Archive and with the Open Library project. And that's how I ended up ... It kind of became decided for me that the Internet Archive’s mission of universal access to all knowledge, and the legacy set by both Brewster and also by Aaron, really became my calling. And I'm very privileged and lucky to be continuing some of these projects.
Gavin: 12:20 So, what advice would you have for someone that feels overwhelmed: “Okay, I'm starting here and how do I get to my goal?” Just in any type of situation, what's the daily sort of grind that keeps you going and you seeing that each day builds towards a larger thing?
mek: 12:41 You know, I feel like a lot of people have written or attempted to write books on the topic of productivity or how to stay focused. And honestly, I don't think I'm the most successful person in the world, certainly. Probably like Jeff Bezos or someone like that would have a good book on how to be productive. I think two things that stick out for me, or maybe three, one of them is just you have to be passionate about the problem that you're working on. Because what ends up happening for a lot of people like me is, it's not rational. It's not just that we want to get good at something and we spend a lot of time. It's almost, I hate to say it, a little bit more like a sickness. Like, it bothers me so much that 71% of the world is living on less than $10 a day. And these people don't have equal opportunity to access academic papers or books.
mek: 13:36 Imagine all of the areas in rural China or in India or in Africa where they don't have a library system, and something like Open Library is their only chance of getting competitive access to high class books. So a part of it is just having a passion and having that passion keep you up at night. And I spent tons of my weekends working on Open Library and engaging with the community and working after hours, partially because I love the work and I love being at a nonprofit, but also just because I really need to see the problem solved. It makes me sick when I see people whose productivity is wasted and they want to do good work.
mek: 14:22 I think the second thing is Aaron Swartz had a wonderful essay on how to be productive. And when I was first stumbling across this essay, I was thinking, “Great. Here's a young man who was super productive. He probably has tons of tips and tricks on how to squeeze more out of every day.” And a part of me was disappointed when I was reading the essay because it wasn't about how to squeeze more out of the day. It was about how to think differently about procrastination. And I think when people get tired or they get worn out, some people will watch TV or they'll consume some media, or just browse Facebook.
mek: 15:06 And I think what Aaron's essay is all about, is recognizing different qualities of time, and noticing, "Hey, right now I'm not on my A game. I don't feel like I can program some world class algorithm. But I can definitely answer emails." And recognizing that during this quality of time, answering emails might have a lot more benefit for the people that I care about, than maybe surfing Facebook. And it doesn't have to be that example, but the notion of evaluating and understanding what time you're facing and making the most of that, I think, is really important to being productive.
mek: 15:43 And then I think the final of the three points is ... and this I kind of learned firsthand from reading Benjamin Franklin's autobiography ... is having a rubric for what you want to accomplish. Putting things on a calendar and being rigorous about to-do lists and tracking things is a wonderful way to ensure that you make progress. It's so much easier just to do the next thing than have to constantly context switch, and be scheduling what to do next. And so for me, I have a Google Sheet, like a spreadsheet, that I make public online. Pretty much everything I talk about, there's some resource or essay on my website, mek.fyi, including the spreadsheet. But the spreadsheet pretends that maybe I'm like a Superman or I’m a Superwoman. And it has a bunch of columns, like how much exercise am I getting? How many minutes of philanthropy and helping people am I participating in per day? Am I doing daily goals? Am I writing what my goals are for the day? How much sleep am I getting?
mek: 16:53 And each column has a weight associated with it. For instance, for me, I try to minimize the amount of media that I consume, and so media has a negative weight, whereas exercise has a positive weight. They all have different scores, and what I do is sum up each column multiplied by the weight, and that gives me a score for every day. And I don't try to really optimize what happens every day. And I'm not really religious about this spreadsheet. What I think it does is it gives me feedback. It lets me know, "Oh shoot, I haven't played music in two weeks. And that's something that's important to me." And I think having that type of pulse is really important for accountability and for one's productivity.
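The scoring mek describes here (each column multiplied by its weight, then summed into one number per day) can be sketched in a few lines of Python. This is only an illustration: the column names and weight values below are hypothetical, chosen to match the kinds of columns he mentions, and are not taken from his actual spreadsheet.

```python
# Hypothetical columns and weights, for illustration only.
# Positive weights reward an activity; a negative weight penalizes it.
WEIGHTS = {
    "exercise_minutes": 1.0,
    "philanthropy_minutes": 1.5,
    "wrote_daily_goals": 10.0,  # 1 if goals were written that day, else 0
    "sleep_hours": 2.0,
    "media_minutes": -0.5,      # media consumption counts against the score
}


def daily_score(day, weights=WEIGHTS):
    """Return the weighted sum of one day's tracked columns.

    Columns missing from `day` are treated as 0.
    """
    return sum(weight * day.get(column, 0) for column, weight in weights.items())


# One example day of tracked values (also hypothetical).
example_day = {
    "exercise_minutes": 30,
    "philanthropy_minutes": 20,
    "wrote_daily_goals": 1,
    "sleep_hours": 8,
    "media_minutes": 60,
}

print(daily_score(example_day))
```

As in the interview, the number is meant as feedback rather than a target: comparing scores across days shows which columns have gone quiet, while the absolute value depends entirely on the weights chosen.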
mek: 17:36 And if I had to add something onto that, I would say just surround yourself with people who are similarly going to hold you accountable and who are going to push you. If you are consuming media and they know that's not something you want to do, the best thing they can do is say, "Hey, maybe there's something else you could be doing with your time right now." I think the combination of those four things has been the best recipe I've been able to find. Mostly the passion and having good friends to keep you on track. But I'm sure someone like Jeff Bezos would have a much better idea about what works for them.
Gavin: 18:10 Thank you for listening to this episode of What Origin?. You can find out more about MEK at mek.fyi, as he mentioned. There’s a lot of resources and interesting information. And check out whatorigin.com to keep up to date with our latest episodes. We’re available on all your favorite platforms, and we also have transcriptions on our website.