Culling captions from YouTube

Note: This script is basically no longer necessary since YouTube now surfaces the captions itself. Under the video, to the right of the ‘save’ link, is an ellipsis (three periods). If you click on that, you get a submenu that reads ‘open transcript’. This is a much simpler way of getting transcripts than this script ;)

If you want to grab YouTube’s auto-generated captions, here is a better script:

var captionCollector = {
    captions : '',

    collect : function(){
      var currentCaption;
      try {
        currentCaption = document.getElementsByClassName("captions-text")[0].innerText.replace(/\n/,'');
      } catch (e) {
        // No caption is on screen right now
        currentCaption = null;
      }

      // Only append captions that have not been collected yet
      if(currentCaption) {
        if(captionCollector.captions.lastIndexOf(currentCaption) == -1) {
          captionCollector.captions += currentCaption;
        }
      }

      // Poll again in half a second
      setTimeout(captionCollector.collect, 500);
    }
};

// Start collecting
captionCollector.collect();

Then evaluate captionCollector.captions to print the captions to the console.

The 6 best new features in npm 5

Last night I attended waffle.js and heard a great talk by Laurie Voss, aka seldo, the CTO of npm, about new features in npm@5. Several things have been upgraded and new features have been added in npm@5, so, if you haven’t upgraded yet, you’re missing out.

Here are a few things that were upgraded in npm@5:

1) npm@5 is fast

npm@5 is much faster than previous versions of npm, and as easy to install as:

npm i npm -g

npm@5 is ten times faster than npm@4.
npm@4 is ten times faster than npm@3.
npm@3 is ten times faster than npm@2.
npm@2 is ten times faster than npm@1.

What does this mean?

10 x 10 x 10 x 10 = 10^4 = 10,000

It means the current version of npm is 10,000 times faster than v1. Yes, I am exaggerating, but just a bit. Believe me, npm@5 is super fast.

2) npm@5 includes package-lock.json by default

Package lock makes npm faster because it doesn’t have to look up all the packages: it knows to use the ones you locked in. In addition, with locked packages, you get the same package versions in production as you do in development. This makes builds more reliable. Gone are the days of trying to figure out a bug only to discover it came from others running different versions of a library.

This does change how updates work. You will no longer automatically get the latest updates. Rather, you now need to explicitly update your packages to semver-compatible package versions.

npx npm-check -u

The above command walks you through each available npm update, allowing you to accept or decline breaking changes. npx is introduced in #6 below.
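To make a semver-compatible update concrete, here is a minimal sketch of the caret-range rule that npm applies by default when it saves a dependency (^1.2.3 accepts anything from 1.2.3 up to, but not including, 2.0.0). The function names are made up for illustration and pre-release tags are ignored; real projects should rely on the semver package rather than this simplification.

```javascript
// Simplified caret-range check; ignores pre-release and build metadata.
function parseVersion(version) {
  return version.split('.').map(Number); // "1.2.3" -> [1, 2, 3]
}

function satisfiesCaret(candidate, base) {
  var c = parseVersion(candidate);
  var b = parseVersion(base);

  // The candidate must not be older than the base version.
  for (var i = 0; i < 3; i++) {
    if (c[i] > b[i]) break;
    if (c[i] < b[i]) return false;
  }

  // The left-most non-zero part of the base must stay unchanged.
  for (var j = 0; j < 3; j++) {
    if (c[j] !== b[j]) return false;
    if (b[j] !== 0) return true;
  }
  return true; // base is 0.0.0 and the candidate equals it
}

console.log(satisfiesCaret('1.4.0', '1.2.3')); // true
console.log(satisfiesCaret('2.0.0', '1.2.3')); // false
```

A minor or patch bump passes, while a new major version (a potentially breaking change) does not.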

3) npm@5 saves by default

The risk of accidentally deleting a package is gone. You no longer have to use --save. If you install a package it will be there. If you don’t want the package anymore, you can delete it.

If you don’t want to save, you can override the default saving with --no-save.

4) Improved cache

Unbeknownst to many, npm has always had a cache — everything downloaded was stored locally — but not a very good one.

The cache improvements in npm@5 include how corrupt caches are handled: npm cache clean has been made obsolete, because with npm@5 a corrupted cache will not be used.

5) npm@5 works offline

npm@5 auto-detects when you are offline. When offline, npm will install using the local, improved, non-corrupted cache, noted in #4 above. You can force npm to run offline when you are actually online by using --offline. This is handy for less than optimal connections, like when you’re whining about airplane, Starbucks, or conference wifi, and for those on metered data plans.

6) npx package runner

The new npx package runner will download, install, execute, and clean itself up. The command will be executed either locally or from a cache, installing any dependencies needed. If a full package specifier is included, npx will use a freshly-installed, temporary version of the package. If no --package option is specified, npx can guess the name of the binary to invoke depending on the specifier provided.

As an example, if you want to create a React app, simply run:

$ npx create-react-app my-app
$ cd my-app
$ npm start

This will create and start a working React app in a new my-app directory (the name is just a placeholder).

The npx package runner is probably the most exciting of the six new features. Laurie’s talk covered the syntax for creating working apps for various libraries.

Future feature: cipm

cipm, the continuous integration package manager currently under development, installs npm dependencies from a package lock. Similar in usage to npm install, cipm removes node_modules before beginning the install. It works with npm@5’s lock file. cipm is useful in continuous integration environments requiring regular, full installs of apps that can cache package data in a central cache.

You can take a look at it. Play with it, but don’t put it into production. Yet.

Web Performance: Video Optimization

According to HTTPArchive, sites went from an average of 2,135 KB to 3,034 KB in the two years from July 1, 2015, to July 1, 2017. Videos are a major part of that. The average website’s video weight grew from 204 KB to 729 KB over the same two-year period. The “low-hanging fruit” of performance used to be optimizing images. Now it’s both images and video.

My optimizing video rules:

  1. If possible, omit videos
  2. Compress all videos
  3. Optimize <source> order
  4. Remove audio from muted heroes

If possible, omit videos

The best way to optimize is to remove unneeded content and unneeded requests.

Do you really need a hero video? Do you really need it on the mobile version of your site?

You can use media queries to avoid downloading the #hero-video on narrow screens.

@media screen and (max-width: 650px) { 
  #hero-video { 
      display: none; 
  }
}

Compress all videos

Most video compression efforts involve comparing adjacent frames within a video and removing details that are the same in the original and subsequent frame. You want to both compress the video and export it to multiple video formats, including WebM, MPEG-4/H.264 and Ogg/Theora.

The software you used to create your video likely includes the ability to reduce the file size. If not, there are several tools available online, like FFmpeg, discussed below, that can help encode, decode, convert and perform other forms of magic.

Optimize <source> order

Order from smallest to largest. For example, given three video compressions at 10 MB, 12 MB, and 13 MB, put the smallest first and the largest last:

<video width="400" height="300" controls="controls">
  <!-- WebM: 10 MB -->
  <source src="video.webm" type="video/webm" />
  <!-- MPEG-4/H.264: 12 MB -->
  <source src="video.mp4" type="video/mp4" />
  <!-- Ogg/Theora: 13 MB -->
  <source src="video.ogv" type="video/ogg" />
</video>

In terms of the order, the browser will download the first video source it understands, so let it hit a smaller one first.  In terms of “smallest”, do make sure that your most compressed video still looks good. There are some compression algorithms that can make your video look like an animated gif. While a 128 Kb video may seem like better user experience than having your users download a 10 MB video, putting a grainy gif-like video behind your content may also negatively impact your brand.

Check current browser support for video and the various media types.
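The selection rule can be sketched in plain JavaScript. This is a simulation, not the browser’s actual algorithm: pickSource and the canPlay callback are hypothetical names, and the sizes are the example figures from above. In a real page the callback could wrap HTMLMediaElement.canPlayType(), which returns an empty string for types the browser cannot play.

```javascript
// Return the first source whose MIME type the (simulated) browser can play.
function pickSource(sources, canPlay) {
  for (var i = 0; i < sources.length; i++) {
    if (canPlay(sources[i].type)) {
      return sources[i];
    }
  }
  return null; // nothing playable: the browser shows its fallback content
}

// Sources listed smallest first, in the same order as the markup above.
var sources = [
  { src: 'video.webm', type: 'video/webm', sizeMB: 10 },
  { src: 'video.mp4',  type: 'video/mp4',  sizeMB: 12 },
  { src: 'video.ogv',  type: 'video/ogg',  sizeMB: 13 }
];

// Pretend this browser only understands MP4 and Ogg.
var chosen = pickSource(sources, function (type) {
  return type === 'video/mp4' || type === 'video/ogg';
});
console.log(chosen.src); // video.mp4
```

Because the first match wins, a browser that supports every format never looks past the first entry, which is why the smallest file belongs at the top.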

Remove audio from muted heroes

Lastly, if you do have a hero video or other video without audio, remove the audio from your video file.

<video autoplay="" loop="" muted="true" id="hero-video">
  <source src="banner_video.webm" 
          type='video/webm; codecs="vp8, vorbis"'>
  <source src="web_banner.mp4" type="video/mp4">
</video>

This hero video code, common to many conference websites and corporate home pages, includes a video that is auto-playing, looping, and muted. It contains no controls, so there is no way to hear the audio. The audio is often empty, but it is still present. It is still using up bandwidth. There is no reason to serve the audio along with a video that is always muted. Removing the audio can save 20% of the bandwidth, which is 2 MB if your video is 10 MB.

Depending on your video making software, you may be able to remove the audio during export and compression. If not, there is a free tool called FFmpeg that can do it for you with the following command:

ffmpeg -i original.mp4 -an -c:v copy audioFreeVersion.mp4

FFmpeg bills itself as the “complete, cross-platform solution to record, convert and stream audio and video,” which it pretty much is.

Feeding the Diversity Pipeline

Welcoming work environments require multiple components. Most notably, the environment needs to be both diverse and inclusive. A lot of attention has been focused on diversity: on ensuring the pipeline to the workforce has diverse candidates. While it is up to each employer / recruiter to ensure they are reaching out to diverse applicants, many organizations have worked to ensure that, if the recruiter actually looked, they would find diverse candidates.

Below are some organizations doing the work of filling up that pipeline:

National & International “In Person” Groups

Girl Develop It: With locations in 56 cities across 33 US states, Girl Develop It provides affordable programs for adult women interested in learning web and software development in a judgment-free environment.

Black Girls Code: Black Girls Code provides workshops with the goal of addressing the dearth of African-American women in STEM.

Yes We Code: #YesWeCode targets low-opportunity youth, providing them with the resources and tools to become computer programmers.

Code2040: With outposts in SF, Austin, Chicago, and Durham, Code2040 creates access, awareness, and opportunities for Black and Latinx engineers.

Girls Who Code: Geared toward 13- to 17-year-old girls, Girls Who Code pairs instruction and mentorship to “educate, inspire and equip” students to pursue their engineering and tech dreams.

Write/Speak/Code:  With three chapters in the USA and an annual conference, Write/Speak/Code empowers women and non-binary software developers to become thought leaders, conference speakers, and open source contributors.

Women Who Code: Focusing on women already in tech, the objectives of Women Who Code include providing networking and mentorship opportunities for women in tech around the world.

Lesbians Who Tech: The goal of Lesbians Who Tech is to make lesbians and their allies in tech more visible to each other and others, to get more women, particularly lesbians, into technology, and to connect lesbians who tech to community LGBTQ and women’s organizations.

Girl Geek Academy: With events in Australia and the USA, Girl Geek Academy initiatives include coding and hackathons, 3D printing and wearables, game development, design, entrepreneurship, and startups. They work with individuals, teachers, schools, corporates, and startups to increase the number of women with professional technical and entrepreneurial skills.

Girls Who Code: With after-school clubs for 6th-12th grade girls to explore coding and 7-week summer immersion programs for 10th-11th grade girls, Girls Who Code aims to “build the largest pipeline of future female engineers in the United States.”

Girls Inc., Operation SMART: With the premise that girls do like and can be good at STEM, the mission of Girls Inc. is to inspire girls to be strong, smart, and bold. With locations in low-income neighborhoods across the United States and Canada, they provide research-based curricula to equip girls to achieve academically, lead healthy and physically active lives, and discover an interest in science, technology, engineering, and math. Operation SMART develops girls’ enthusiasm for and skills in STEM through hands-on activities, training, and mentorship.

AnnieCannons: Helps human trafficking survivors gain web development skills.

Ladies that UX: A group that creates spaces for women of all levels to engage and talk about their experiences, both positive and negative, and get support and inspiration, Ladies that UX has over 54 chapters in over 20 countries, spanning four continents.

Technovation: Entrepreneurs, mentors, and educators teaching girls how to become tech entrepreneurs and leaders, from completing a tech curriculum to launching their mobile app startup.

Regional Efforts

North America

Color Coded: Based in Washington DC, Color Coded is a community for people in tech and people interested in tech.  Their events include workshops, co-working sessions, hackathons, interview preparation, and everything in between.

Native Girls Code: Based in Seattle, Washington, Native Girls Code introduces indigenous teen girls (12 to 18) to opportunities in the field of science, technology, engineering, art, and mathematics (STEAM).

Techqueria!: A community of Latinx professionals in the tech industry in the greater San Francisco Bay Area.

The Last Mile: Based out of San Quentin Prison in the San Francisco Bay Area, The Last Mile teaches inmates CSS, JS, HTML and Python with the aim of providing successful reentry and reducing recidivism.

Coder_Girl: Based in St. Louis, Missouri, CoderGirl is a year-long tech training program consisting of two 6-month cycles: a learning cycle and a project cycle. CoderGirl provides a space for women of all skill levels to learn to code: from Web Development & Design, C#/.NET, SQL, iOS, to Java and UX.

Sisters Code: Based in Detroit, Michigan, Sisters Code’s goal is to educate, empower, and entice women ages 25 – 85 to explore the world of coding and technology. Their weekend-long classes teach women to code interactive websites using JavaScript, HTML and CSS for re-careering into tech.

TechTonica: Free tech training, living and childcare stipends, and job placement for local women & non-binary adults with low incomes in the San Francisco Bay area

Hack the Hood: Web dev workshops and bootcamps for low-income young people of color in the Bay Area and throughout Northern California from Modesto to Gilroy to San Francisco and beyond.

Techfest Club: Event series for women in tech in New York City

Outside the USA

Geekettes: While they have many hubs, currently the Berlin and Minneapolis hubs are hosting tech talks, workshops and hackathons to teach and refine skills and bring women together to create unique, original products.

She Codes: Provides educational events for female software developers in the hi-tech industry throughout Israel.

CodeBar: Free workshops in London bridging the diversity gap.

BlackGirl.Tech:  Providing black women the opportunity to get into coding in London/Bristol

Online Resources:

Programming Language Resources

  • RailsBridge: Free workshops from beginner to intermediate in Rails and Ruby, focus on increasing diversity and inclusion for all genders, races, experiences, etc.
  • LinuxChix: Linux community providing technical and social support for women Linux users.
  • PyLadies: Mentorship focusing on helping more women become active participants and leaders in the Python open-source community.
  • Django Girls: Free Python and Django workshops and open sourced online tutorials.

Soon to be Added:

Additional Resources

  • Under-represented conferences: Twitter list of programming conferences for or about people of color, women, LGBTQI, etc in tech from CallBackWomen
  • Tech Women Programs List: Twitter list of organizations, publications, events and other programs that work to advance technical women via Anita Borg
  • Speaking Advice: Twitter list of resources encouraging people from underrepresented groups to speak at conferences from CallBackWomen
  • Karen Church’s Medium Post: List of 80 Women in STEM SF Bay area resources from 2015.
  • Kira Newman’s list: 30+ Organizations for Women in Technology from 2012
  • Being inclusive in CS: Cynthia Lee’s guide to supporting inclusion in computer science classrooms

Diversity v. Inclusivity

Interviewing diverse candidates will not create a diverse environment. While the above organizations may have filled that diversity pipeline, that pipeline is full of leaks. Diversity recruiting is really only lip service. Work, school, community and conference environments need to be inclusive. Inclusivity is the sealant that prevents many pipeline leaks. Creating an inclusive work environment is necessary, but was not the focus of this post.

Let’s Encrypt on cPanel

Here is how I set up Let’s Encrypt for all sites on my hosted virtual private server running cPanel.

1. SSH as root

$ ssh -p 22 root@

Your port might also be 2200. Ask your VPS hosting provider.

2. Then run the command:

$ /scripts/install_lets_encrypt_autossl_provider

3. Log into your main control panel:

or however you access it (possibly port 2086 if http://)

If the command in step 2 was successful, you should have seen the following (or something similar):

Running transaction
  Installing : cpanel-letsencrypt-2.16-3.2.noarch            1/1
  Verifying  : cpanel-letsencrypt-2.16-3.2.noarch            1/1

4. Under SSL/TLS you’ll find “Manage AutoSSL”
Under “providers”, you’ll see “Let’s Encrypt”. That’s a new option that was created by running the command as root.

Select “Let’s Encrypt”. Then agree to their terms of service and create a new registration with Let’s Encrypt if necessary. Under the “managed users” tab you can enable / disable AutoSSL by account.

5. Now, under the control panel of each account, under SECURITY > SSL/TLS, under “Install and Manage SSL for your site (HTTPS)”, if you select “Manage SSL Sites”, you’ll see the Let’s Encrypt  cert.

Note: If you had a self-signed certificate (which you don’t want), delete the cert in the individual account. Click the “run AutoSSL for all users” button as root under “Manage AutoSSL”. When you refresh the individual user, the correct cert should be there.

6. Yay. All your accounts now have an SSL Cert. You still need to redirect all of your http:// traffic to https://. In the .htaccess add the following:

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

Now every http:// request redirects to its https:// equivalent, and one of these days this blog will do the same.
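For readers who find the rewrite rule terse, this JavaScript sketch mirrors what it does to a URL: only the scheme changes, while the host, path, and query string are kept. It illustrates the mapping and is not a replacement for the .htaccess rule; toHttps is a made-up helper name and example.com is a placeholder domain.

```javascript
// Map an http:// URL to its https:// equivalent, leaving https:// URLs alone.
function toHttps(url) {
  var parsed = new URL(url);
  if (parsed.protocol === 'http:') {
    parsed.protocol = 'https:';
  }
  return parsed.href;
}

console.log(toHttps('http://example.com/blog/?page=2'));
// https://example.com/blog/?page=2
```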

W3C Performance Specifications

Here are some of the W3C’s web performance specifications:

  • High Resolution Time (Level 3)

    The DOMHighResTimeStamp type, performance.now() method, and performance.timeOrigin attribute of the Performance interface resolve issues with earlier time APIs by providing monotonically increasing time values with sub-millisecond resolution.

  • Performance Timeline (Level 2)

    Extends definition of the Performance interface, exposes PerformanceEntry in Web Workers and adds support for the PerformanceObserver interface.

  • Resource Timing (Level 3)

    Defines the PerformanceResourceTiming interface providing timing information related to resources in a document.
    Supported in all browsers except Safari and Opera Mini, starting with IE10.

  • User Timing (Level 2)

    Extends Performance interface with PerformanceMark and PerformanceMeasure.
    Supported in all browsers except Safari and Opera Mini, starting with IE10.

  • Beacon API

    Defines a beacon API which can “guarantee” asynchronous and non-blocking delivery of data, while minimizing resource contention with other time-critical operations.
    Not supported in IE, Safari or Opera Mini. Support started with Edge 14
    navigator.sendBeacon() on MDN

  • Preload

    Defines preload for resources which need to be fetched as early as possible, without being immediately processed and executed. Preloaded resources can be specified via declarative markup, the Link HTTP header, or scheduled with JS.

  • Cooperative Scheduling of Background Tasks

    Adds the requestIdleCallback method on the Window object, which enables the browser to schedule a callback when it would otherwise be idle, along with the associated cancelIdleCallback and timeRemaining methods.
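As a small illustration of High Resolution Time, the sketch below times a function with performance.now(), which returns a DOMHighResTimeStamp in milliseconds measured from performance.timeOrigin. The helper name timeIt and the busy-loop workload are invented for the example, and it assumes an environment with a global performance object (current browsers and recent Node.js versions).

```javascript
// Measure how long a synchronous function takes, in milliseconds.
function timeIt(fn) {
  var start = performance.now();
  fn();
  var end = performance.now();
  return end - start; // fractional milliseconds where the platform allows it
}

var elapsed = timeIt(function () {
  var total = 0;
  for (var i = 0; i < 1e6; i++) { total += i; }
});
console.log('Loop took ' + elapsed.toFixed(3) + ' ms');
```

Unlike Date.now(), the values never go backwards when the system clock is adjusted, so the difference between two readings is safe to use as a duration.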

Capturing Captions from YouTube Videos

Note: This works for pre-captioned videos, not videos auto-captioned by YouTube.

There are many tutorials on how to download the caption files created by YouTube if you own the video, but I was unable to find a way to download the captions of videos I don’t own. There’s probably an easy way to do it, but since I couldn’t find it, creating a JavaScript function to do it via the console took less effort.

Here’s the code (I did it three ways depending on how you like to code your JS)

Constructor method:

function CaptionCollector () {
  var that = this;
  this.captions = '';
  var nowShowing = '';

  this.collect = function(){
    var currentCaption;
    try {
      currentCaption = document.getElementsByClassName("captions-text")[0].innerText;
    } catch (e) {
      // No caption is on screen right now
      currentCaption = null;
    }

    // Only append when the caption on screen has changed
    if(currentCaption && nowShowing != currentCaption) {
      nowShowing = currentCaption;
      that.captions += ' ' + nowShowing;
    }

    setTimeout(that.collect, 300);
  };

  // Start polling as soon as the collector is created
  this.collect();
}

var foo = new CaptionCollector();

Print the caption with foo.captions. Of course you can use anything instead of “foo”.

Here’s a version using JS object notation:

var captionCollector = {
    captions : '',
    nowShowing: '',

    collect : function(){
      var currentCaption;
      try {
        currentCaption = document.getElementsByClassName("captions-text")[0].innerText;
      } catch (e) {
        currentCaption = null;
      }

      // Refer to captionCollector rather than this: setTimeout calls
      // collect without the object as its context
      if(currentCaption && captionCollector.nowShowing != currentCaption) {
        captionCollector.nowShowing = currentCaption;
        captionCollector.captions += ' ' + captionCollector.nowShowing;
      }

      setTimeout(captionCollector.collect, 300);
    }
};

captionCollector.collect();

With this version, you print the captions to the console with captionCollector.captions

Or you can use the anonymous function method with a single global variable:

    ___captions = '';

    (function() {
        var ___nowShowing = '';

        function getCaption() {
            var currentCaption;
            try {
              currentCaption = document.getElementsByClassName("captions-text")[0].innerText;
            } catch (e) {
              currentCaption = null;
            }

            if(currentCaption && ___nowShowing != currentCaption) {
              ___nowShowing = currentCaption;
              ___captions += ' ' + ___nowShowing;
            }

            setTimeout(getCaption, 300);
        }

        getCaption();
    })();

With this version, you print the captions to the console with the global variable ___captions

Because it uses the class name of the caption box YouTube uses for videos, this only works on YouTube. Alter the class name for other video services.

You do have to play the whole video to capture all the captions. With settings, you can play the video at twice the speed.

Clear the console when the video ends. Print the transcript to the console. Select all. Copy. You’re good to go.
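The polling logic shared by all three versions can be tried without a video by separating it from the DOM lookup. The sketch below is a restatement with hypothetical names (createTranscript, add, text): it appends a caption only when it differs from the one currently showing, and the array stands in for what successive polls would read from the page.

```javascript
// Collect captions, skipping repeats of the caption that is still on screen.
function createTranscript() {
  var nowShowing = '';
  var captions = '';

  return {
    add: function (currentCaption) {
      if (currentCaption && nowShowing !== currentCaption) {
        nowShowing = currentCaption;
        captions += ' ' + nowShowing;
      }
    },
    text: function () {
      return captions;
    }
  };
}

// Simulate four polls: one repeat and one moment with no caption on screen.
var transcript = createTranscript();
['Hello', 'Hello', null, 'world'].forEach(function (caption) {
  transcript.add(caption);
});
console.log(transcript.text()); // " Hello world"
```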

TWELP: Twitter Help

Updated November 1 to remove .js-activity-generic (who your followees added).
Updated September 15 to add Firefox support.

Sometimes Twitter gets things wrong. Very, very, wrong. A few “features” that I think are bugs include Twitter Moments, promoted tweets, and “X liked” tweets.

To this end, I created a little bookmarklet called “TWELP”.

<a href="javascript:(function($){function kill(){$('.promoted-tweet, .Icon--heartBadge').closest('').css('display','none');$('.js-moments-tab, .DismissibleModule, .js-activity-generic ').css('display','none');setTimeout(kill, 1000);}kill();})(window.jQuery);">TWELP</a>

Let me rewrite that for you in a version that’s easy to read, but won’t work to copy and paste:

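    function kill($) {
         $('.promoted-tweet, .Icon--heartBadge').closest('').css('display','none');
         $('.js-moments-tab, .DismissibleModule, .js-activity-generic').css('display','none');
         setTimeout(kill, 1000);
    }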

The bookmarklet creates a kill function that:

  1. hides promoted tweets by finding the parent tweet containing a promoted-tweet child class
  2. hides any “liked” tweets that contain the heart icon, including uninteresting tweets in your stream such as the fact that your friend Jane liked a tweet of a picture of her acquaintance Joe, who you are not following, eating an oyster. Seriously, who the fuck cares? It also hides the “people who liked your tweet” feature in your notifications. Not sure if that is a feature or a bug.
  3. The ‘.js-activity-generic’ hides the ‘pat and chris followed leslie’. Seriously, double wtf cares? I am testing in production, so maybe this has some unwanted side effects.
  4. hides the “Moments” tab by hiding the tab that has the  js-moments-tab class
  5. hides promoted modules that I hate like “In Case You Missed It” and “Who to follow”
  6. Calls itself once per second so if you scroll, it will continue killing those annoying tweets mentioned above.
  7. You have to pass window.jQuery to $ because Firefox defines its own $. (Thanks to @Potch for that tidbit)

TWELP – You can drag this link to your bookmarks bar, and click TWELP bookmarklet whenever you load Twitter. It kills the “Moments” tab, all ads, and removes the “X liked” tweets.

or, you can wrap your own.

Speed Perception

TL;DR: Please take the SpeedPerception challenge to help us confirm results about which web performance metrics best match human perception of speed.

Last summer, I was involved in a study called “SpeedPerception”, a large-scale web performance crowdsourced study focused on the perceived loading performance of above-the-fold content aimed at understanding what “slow” and “fast” mean to users. I am now involved in the second part of this study which aims to confirm (or refute) our findings.

SpeedPerception: the general idea

Traditional web performance metrics, like those defined in the W3C Navigation Timing draft specification, focus on timing each process along the content delivery pipeline, such as Time to First Byte (TTFB) and Page Load Time. SpeedPerception’s goal is to tackle the web performance measurement challenge from a different angle: one that puts user experience first by focusing on the visual perception of the page load process. We show the user sample video pairs of websites loading, generated with WebPagetest (WPT), and ask them which of the pair they perceive as having loaded faster.

In the first phase, we measured only Internet Retailer top-500 (IR500) sites in desktop size. Now we are testing whether the results we measured hold more generally: in other words, do they only work for our IR500 sites on desktop? Will we get consistent results when testing Alexa top-1000 (Alexa1000) homepages? Will we see the same results if we test on mobile-size screens with mobile lie-fi performance?

In this second phase, we’re testing both mobile and desktop versions of both IR500  and Alexa1000 website home pages. We’ve also added a way of measuring the user’s time to click so we can compare apples to apples.

The goal is to create a free, open-source, benchmark dataset to advance the systematic study of how human end-users perceive the webpage loading process: the above-the-fold rendering in particular. Our belief (and hope) is that such a benchmark can provide a quantitative basis to compare different algorithms and spur computer scientists to make progress on helping quantify perceived webpage performance.

Take the SpeedPerception challenge!

How was SpeedPerception created?

Videos were created using Patrick Meenan’s open-source WebPagetest (a.k.a. WPT). We made 600+ videos of 2016 IR500 and Alexa1000 home pages loading. The runs were done in February 2017. Videos were turned into GIFs. Video pairs were grouped using a specific set of rules to help limit bias and randomness. Everything is available on GitHub.

Open Source

Just like we did with the results of phase 1 of SpeedPerception, once the crowd-sourcing component generates a sufficient amount of user data, we will open source the dataset, making it available to the web performance community, along with the analysis of what we discover.

Please help us by taking the SpeedPerception Challenge now. Thanks.

Results from Phase 1

In phase 1, we discovered that a combination of three values can achieve upwards of 85% accuracy in explaining majority human A/B choices: an abbreviated SpeedIndex up to time to click (TTC), an abbreviated Perceptual Speed Index up to TTC, and startRender (or Render). Does the power of this new combination “model” hold true for all sites, or just our original data set? This is what we’re working on finding out.
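To show what accuracy in explaining majority human A/B choices means, here is a minimal sketch with made-up data. For each video pair it checks whether the site with the lower metric value (lower meaning faster) is also the one most participants picked. The field names, the numbers, and the single-metric comparison are assumptions for illustration only; the study’s model combines three metrics.

```javascript
// Share of pairs where the metric's "faster" site matches the majority vote.
function agreementRate(pairs) {
  var matches = 0;
  pairs.forEach(function (pair) {
    var metricPick = pair.metricA < pair.metricB ? 'A' : 'B';
    var humanPick = pair.votesA > pair.votesB ? 'A' : 'B';
    if (metricPick === humanPick) matches++;
  });
  return matches / pairs.length;
}

// Hypothetical results: metric values in ms, votes are participant counts.
var samplePairs = [
  { metricA: 1200, metricB: 1800, votesA: 40, votesB: 10 }, // agree
  { metricA: 2500, metricB: 2100, votesA: 12, votesB: 38 }, // agree
  { metricA: 900,  metricB: 1100, votesA: 20, votesB: 30 }, // disagree
  { metricA: 3000, metricB: 1500, votesA: 5,  votesB: 45 }  // agree
];
console.log(agreementRate(samplePairs)); // 0.75
```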

If you’re interested in phase 1, here’s some more light reading:


Leak without a trace: Anonymous Whistleblowing

Instructions on how to leak data without getting caught

  1. Don’t leave digital traces while copying data.
  2. Write stuff down on a pad which belongs to you, and take it home.
  3. Photograph your screen. Don’t create files with copies of the data you are planning to exfiltrate on your work computer.
  4. If the data is on an internal web server, try to access what you are planning to leak in the form of multiple partial queries over a period of time, instead of as one big query.
  5. If you have large volumes of data, save it on a never-used-before USB drive and bring that home. Don’t ever use that USB drive for anything else again.
  6. Don’t use work-owned equipment to post from. Many employers have monitoring software installed, and will easily be able to see who posted what.
  7. Don’t use equipment where you’ve installed employer-supplied monitoring software to post.
  8. If you just read this, on equipment which might be monitored, wait a while before posting anything sensitive. Don’t give somebody who is watching a chance to correlate your seeing this with a post right after that.
  9. Install Tor, which you can get from the Tor Project. This is a special browser which is slow, but provides strong anonymity. It will prevent anybody at your ISP (if they’re watching) from knowing what sites you visit with it, and it will prevent any sites you visit from knowing what the IP address of your computer is. This will make it much harder for either of them to identify you.
  10. Use the Tor browser for posting your leak.
  11. You must follow these instructions to make sure tor really works:
  12. The instructions about never opening a downloaded file are vitally important — things like .doc and .pdf files can contain software which will expose who you are.
  13. The instruction about using an https version of a web site instead of the plain old http version is also very important. This is because while tor provides very strong anonymity, it doesn’t provide a secure connection to the web site — the https connection to the site does that.
  14. If you’re posting to a social site like twitter or reddit, or using an email account for it, you’ll need to set up a new account for posting your leak.
  15. If you’ve got an account which you have used when not on tor, then Google, twitter, reddit, etc. can identify the IP address of your computer from previous sessions.
  16. If you’ve ever posted anything on an account, there’s a good chance you’ve leaked information about yourself. Don’t take this risk, and just use a new account exclusively for leaking.
  17. Do NOT post files from standard editors, like Word or Excel, or photos straight from your camera. Most programs and recording equipment embed metadata in their files, like the identity of the creator, the serial number of a camera, and the like, which can be used to identify you. Plain old .txt files are ok. Pretty much anything else risks your identity.
  18. When posting photos, be sure to use a metadata stripping tool like jhead for photos. Turn off photo syncing (e.g., iCloud).
  19. The secure drop systems used by some news outlets like ProPublica and the New York Times may be able to strip this kind of thing, but ask a journalist about your specific file format first.
  20. After posting the leak, you need to NEVER use that account for any purpose not directly tied to the leak, since you may make the mistake of giving away your identity.
  21. If you want to leak to the press (which may get broader coverage, but may also decide not to publish at all) organizations like ProPublica and the New York Times have special drop boxes set up to allow the posting.
  22. Details here: