The 6 best new features in npm 5

Last night I attended waffle.js and heard a great talk by Laurie Voss, aka seldo, the CTO of npm, about new features in npm@5. Several things have been upgraded and new features have been added in npm@5, so, if you haven’t upgraded yet, you’re missing out.

Here are a few things that were upgraded in npm@5:

1) npm@5 is fast

npm@5 is much faster than previous versions of npm, and as easy to install as:

npm i npm -g

npm@5 is ten times faster than npm@4.
npm@4 is ten times faster than npm@3.
npm@3 is ten times faster than npm@2.
npm@2 is ten times faster than npm@1.

What does this mean?

10 × 10 × 10 × 10 = 10⁴ = 10,000

It means the current version of npm is 10,000 times faster than v1. Yes, I am exaggerating, but just a bit. Believe me, npm@5 is super fast.

2) npm@5 includes package-lock.json by default

Package lock makes npm faster because it doesn’t have to look up all the packages: it knows to use the ones you locked in. In addition, with locked packages, you get the same package versions in production as you do in development. This makes builds more reliable. Gone are the days of trying to figure out a bug only to discover it came from others running different versions of a library.
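
For reference, here is roughly what a pared-down package-lock.json looks like (a sketch, not from the talk; exact fields vary by npm version, and the package, version numbers, and integrity hash below are placeholders):

{
  "name": "my-app",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "requires": true,
  "dependencies": {
    "lodash": {
      "version": "4.17.4",
      "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.4.tgz",
      "integrity": "sha512-placeholder"
    }
  }
}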

This does change how updates work. You will no longer automatically get the latest updates. Instead, you now need to explicitly update your packages to semver-compatible versions.

npx npm-check -u

The above snippet will walk you through each available npm update, allowing you to accept or decline each one, including breaking changes. npx is introduced in #6 below.

3) npm@5 saves by default

The risk of losing a package because you forgot to save it is gone: you no longer have to use --save. If you install a package, it will be added to your package.json. If you don’t want the package anymore, you can delete it.

If you don’t want to save, you can override the default saving with --no-save.
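
For example (a quick sketch; lodash is just a stand-in for any package):

npm i lodash            # added to dependencies in package.json automatically
npm i lodash --no-save  # installed, but package.json is left alone
npm uninstall lodash    # removed from node_modules and from package.json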

4) Improved cache

Unbeknownst to many, npm has always had a cache — everything downloaded was stored locally — but not a very good one.

The cache improvements in npm@5 include how corrupt caches are handled: manually clearing the cache (npm cache clean) is essentially obsolete because, with npm@5, a corrupted cache entry simply will not be used.

5) npm@5 works offline

npm@5 auto-detects when you are offline. When offline, npm will install using the local, improved, non-corrupted cache noted in #4 above. You can also force npm to run offline when you are actually online by using --offline. This is handy for less-than-optimal connections, like airplane, Starbucks, or conference wifi, and for those on metered data plans.
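
For example, a quick sketch using the --offline and --prefer-offline flags:

npm install --offline         # use only the local cache; fail if something is missing
npm install --prefer-offline  # use the cache when possible, hit the network only when needed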

6) npx package runner

The new npx package runner will download, install, execute, and clean itself up. The command will be executed either locally or from a cache, installing any dependencies needed. If a full package specifier is included, npx will use a freshly-installed, temporary version of the package. If no --package option is specified, npx can guess the name of the binary to invoke depending on the specifier provided.

As an example, if you want to create a React app (my-app here is just a placeholder name), simply run:

$ npx create-react-app my-app
$ cd my-app
$ npm start

This will create a working React app.

The npx package runner is probably the most exciting of the six new features. Laurie’s talk covered the syntax for creating working apps for various libraries.

Future feature: cipm

cipm, the continuous integration package manager currently under development, installs npm dependencies from a package lock. Similar in usage to npm install, cipm removes node_modules before beginning the install. It works with npm@5’s lock file. cipm is useful in continuous integration environments that require regular, full installs and that can cache package data in a central cache.

You can take a look at https://github.com/zkat/cipm. Play with it, but don’t put it into production. Yet.
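
If you do play with it, usage looks roughly like this (a sketch based on the project description; since cipm is still under development, the command name and behavior may change, and my-project is just a placeholder):

npm install -g cipm

cd my-project   # a project that already has a package-lock.json
cipm            # removes node_modules and installs strictly from the lock file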

Web Performance: Video Optimization

According to HTTP Archive, sites grew from an average of 2,135 KB to 3,034 KB in the two years from July 1, 2015, to July 1, 2017. Video is a major part of that: the average web site’s video weight grew from 204 KB to 729 KB over the same two-year period. The low-hanging fruit of performance used to be optimizing images. Now it’s both images and video.

My optimizing video rules:

  1. If possible, omit videos
  2. Compress all videos
  3. Optimize <source> order
  4. Remove audio from muted heroes

If possible, omit videos

The best way to optimize is to remove unneeded content and unneeded requests.

Do you really need a hero video? Do you really need it on the mobile version of your site?

You can use media queries to avoid downloading the #hero-video on narrow screens.

@media screen and (max-width: 650px) { 
  #hero-video { 
      display: none; 
   } 
}

Compress all videos

Most video compression efforts involve comparing adjacent frames within a video and removing details that are the same from one frame to the next. You want to both compress the video and export it to multiple formats, including WebM, MPEG-4/H.264, and Ogg/Theora.

The software you used to create your video likely includes the ability to optimize the file size. If not, there are several tools, like FFmpeg, discussed below, that can help encode, decode, convert, and perform other forms of magic.
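
For example, FFmpeg, covered in more detail below, can handle both the compression and the format conversion from the command line. A rough sketch (original.mov is whatever your editing software exported; tune quality settings like -crf to your own footage, since higher values mean smaller, lower-quality files):

ffmpeg -i original.mov -c:v libx264 -crf 28 -preset slow -c:a aac -b:a 128k video.mp4
ffmpeg -i original.mov -c:v libvpx-vp9 -crf 35 -b:v 0 -c:a libopus video.webm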

Optimize <source> order

Order from smallest to largest. For example, given three video compressions at 10 MB, 12 MB, and 13 MB, put the smallest first and the largest last:

<video width="400" height="300" controls="controls">
  <!-- WebM: 10 MB -->
  <source src="video.webm" type="video/webm" />
  <!-- MPEG-4/H.264: 12 MB -->
  <source src="video.mp4" type="video/mp4" />
  <!-- Ogg/Theora: 13 MB -->
  <source src="video.ogv" type="video/ogv" />
</video>

In terms of the order, the browser will download the first video source it understands, so let it hit a smaller one first. In terms of “smallest”, do make sure that your most compressed video still looks good. There are some compression algorithms that can make your video look like an animated GIF. While a 128 KB video may seem like a better user experience than making your users download a 10 MB video, putting a grainy, GIF-like video behind your content may also negatively impact your brand.

See CanIUse.com for current browser support of video and the various media types.  

Remove audio from muted heroes

Lastly, if you do have a hero video or other video without audio, remove the audio from your video file.

<video autoplay loop muted id="hero-video">
  <source src="banner_video.webm" 
          type='video/webm; codecs="vp8, vorbis"'>
  <source src="web_banner.mp4" type="video/mp4">
</video>

This hero video code, common to many conference websites and corporate home pages, includes a video that is auto-playing, looping, and muted. It contains no controls, so there is no way to hear the audio. The audio is often empty, but it is still present. It is still using up bandwidth. There is no reason to serve the audio along with a video that is always muted. Removing the audio can save 20% of the bandwidth, which is 2 MB if your video is 10 MB.

Depending on your video making software, you may be able to remove the audio during export and compression. If not, there is a free tool called FFmpeg that can do it for you with the following command:

ffmpeg -i original.mp4 -an -c:v copy audioFreeVersion.mp4

FFmpeg bills itself as the “complete, cross-platform solution to record, convert and stream audio and video,” which it pretty much is.

Feeding the Diversity Pipeline

Welcoming work environments require multiple components. Most notably, the environment needs to be both diverse and inclusive. A lot of attention has been focused on diversity: on ensuring the pipeline to the workforce has diverse candidates. While it is up to each employer and recruiter to ensure they are reaching out to diverse applicants, many organizations have worked to ensure that, if recruiters actually look, they will find diverse candidates.

Below are some organizations doing the work of filling up that pipeline:

National & International “In Person” Groups

Girl Develop It: With locations in 56 cities across 33 US states, Girl Develop It provides affordable programs for adult women interested in learning web and software development in a judgment-free environment.

Black Girls Code: Black Girls Code provides workshops with the goal of addressing the dearth of African-American women in STEM.

Yes We Code: #YesWeCode targets low-opportunity youth, providing them with the resources and tools to become computer programmers.

Code2040: With outposts in SF, Austin, Chicago, and Durham, Code2040 creates access, awareness, and opportunities for Black and Latinx engineers.

Girls Who Code: Geared toward 13- to 17-year-old girls, Girls Who Code pairs instruction and mentorship to “educate, inspire and equip” students to pursue their engineering and tech dreams. With after-school clubs for 6th-12th grade girls and 7-week summer immersion programs for 10th-11th grade girls, it aims to “build the largest pipeline of future female engineers in the United States.”

Write/Speak/Code:  With three chapters in the USA and an annual conference, Write/Speak/Code empowers women and non-binary software developers to become thought leaders, conference speakers, and open source contributors.

Women Who Code: Focusing on women already in tech, the objectives of Women Who Code include providing networking and mentorship opportunities for women in tech around the world

Lesbians Who Tech: The goal of Lesbians Who Tech is to make lesbians and their allies in tech more visible to each other and to others, to get more women, particularly lesbians, into technology, and to connect lesbians who tech to LGBTQ and women’s community organizations.

Girl Geek Academy: With events in Australia and the USA, Girl Geek Academy initiatives include coding and hackathons, 3D printing and wearables, game development, design, entrepreneurship, and startups. They work with individuals, teachers, schools, corporates, and startups to increase the number of women with professional technical and entrepreneurial skills.

Girls Inc., Operation SMART: With the premise that girls do like and can be good at STEM, the mission of Girls Inc. is to inspire girls to be strong, smart, and bold. With locations in low-income neighborhoods across the United States and Canada, they provide research-based curricula to equip girls to achieve academically, lead healthy and physically active lives, and discover an interest in science, technology, engineering, and math. Operation SMART develops girls’ enthusiasm for and skills in STEM through hands-on activities, training, and mentorship.

AnnieCannons: Helps human trafficking survivors gain web development skills

LadiesThatUX.com: Ladies That UX creates spaces for women of all levels to engage, talk about their experiences, both positive and negative, and get the support and inspiration they need. It has over 54 chapters in over 20 countries, spanning four continents.

Technovation: Entrepreneurs, mentors, and educators teach girls how to become tech entrepreneurs and leaders, from completing a tech curriculum to launching their mobile app startup.

Regional Efforts

North America

Color Coded: Based in Washington DC, Color Coded is a community for people in tech and people interested in tech.  Their events include workshops, co-working sessions, hackathons, interview preparation, and everything in between.

Native Girls Code: Based in Seattle, Washington, Native Girls Code introduces indigenous teen girls (12 to 18) to opportunities in the field of science, technology, engineering, art, and mathematics (STEAM).

Techqueria!: A community of Latinx professionals in the tech industry in the greater San Francisco Bay Area.

The Last Mile: Based out of San Quentin Prison in the San Francisco Bay Area, The Last Mile teaches inmates CSS, JS, HTML, and Python with the aim of supporting successful reentry and reducing recidivism.

Coder_Girl: Based in St. Louis, Missouri, CoderGirl is a year-long tech training program consisting of two 6-month cycles: a learning cycle and a project cycle. CoderGirl provides a space for women of all skill levels to learn to code: from Web Development & Design, C#/.NET, SQL, iOS, to Java and UX.

Sisters Code: Based in Detroit, Michigan, Sisters Code’s goal is to educate, empower, and entice women ages 25 – 85 to explore the world of coding and technology. Their weekend-long classes teach women to code interactive websites using JavaScript, HTML and CSS for re-careering into tech.

TechTonica: Free tech training, living and childcare stipends, and job placement for local women and non-binary adults with low incomes in the San Francisco Bay Area.

Hack the Hood: Web dev workshops and bootcamps for low-income young people of color in the Bay Area and throughout Northern California from Modesto to Gilroy to San Francisco and beyond.

Techfest Club: Event series for women in tech in New York City

Outside the USA

Geekettes: While they have many hubs, currently the Berlin and Minneapolis hubs host tech talks, workshops, and hackathons to teach and refine skills and bring women together to create unique, original products.

She Codes: Provides educational events for female software developers in the hi-tech industry throughout Israel.

CodeBar: Free workshops in London bridging the diversity gap.

BlackGirl.Tech: Provides black women the opportunity to get into coding in London and Bristol.

Online Resources:

Programming Language Resources

  • RailsBridge: Free workshops from beginner to intermediate in Rails and Ruby, with a focus on increasing diversity and inclusion for all genders, races, experiences, etc.
  • LinuxChix: Linux community providing technical and social support for women Linux users.
  • PyLadies: Mentorship focusing on helping more women become active participants and leaders in the Python open-source community.
  • Django Girls: Free Python and Django workshops and open sourced online tutorials.

Soon to be Added:

Additional Resources

  • Under-represented conferences: Twitter list of programming conferences for or about people of color, women, LGBTQI, etc in tech from CallBackWomen
  • Tech Women Programs List: Twitter list of organizations, publications, events and other programs that work to advance technical women via Anita Borg
  • Speaking Advice: Twitter list of resources encouraging people from underrepresented groups to speak at conferences from CallBackWomen
  • Karen Church’s Medium Post: List of 80 Women in STEM SF Bay area resources from 2015.
  • Kira Newman’s list: 30+ Organizations for Women in Technology from 2012
  • Being inclusive in CS: Cynthia Lee’s guide to supporting inclusion in computer science classrooms

Diversity v. Inclusivity

Interviewing diverse candidates will not create a diverse environment. While the above organizations may have filled the diversity pipeline, that pipeline is full of leaks. Without inclusion, diversity recruiting is really only lip service. Work, school, community, and conference environments need to be inclusive. Inclusivity is the sealant that prevents many pipeline leaks. Creating an inclusive work environment is necessary, but it was not the focus of this post.

Let’s Encrypt on cPanel

Here is how I set up Let’s Encrypt for all sites on my hosted virtual private server running cPanel.

1. SSH as root

$ ssh -p 22 root@123.1.23.123

Your port might also be 2200. Ask your VPS hosting provider.

2. Then run the command:

$ /scripts/install_lets_encrypt_autossl_provider

3. Log into your main control panel:

https://123.1.23.123:2087

or however you access it (possibly port 2086 if http://)

You should see the following (or something similar) if successful:

Running transaction
  Installing : cpanel-letsencrypt-2.16-3.2.noarch            1/1
 
  Verifying  : cpanel-letsencrypt-2.16-3.2.noarch            1/1
 
Complete!

4. Under SSL/TLS you’ll find “Manage AutoSSL”
Under “providers”, you’ll see “Let’s Encrypt”. That’s a new option that was created by running the command as root.

Select “Let’s Encrypt”. Then agree to their terms of service and create a new registration with Let’s Encrypt if necessary. Under the “managed users” tab you can enable / disable AutoSSL by account.

5. Now, under the control panel of each account, under SECURITY > SSL/TLS, under “Install and Manage SSL for your site (HTTPS)”, if you select “Manage SSL Sites”, you’ll see the Let’s Encrypt  cert.

Note: If you had a self-signed certificate (which you don’t want), delete the cert in the individual account. Then, as root, click the “run AutoSSL for all users” button under “Manage AutoSSL”. When you refresh the individual user, the correct cert should be there.

6. Yay. All your accounts now have an SSL cert. You still need to redirect all of your http:// traffic to https://. In your .htaccess, add the following:

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Now http://machinelearningworkshop.com/ redirects to https://machinelearningworkshop.com and http://www.machinelearningworkshop.com redirects to https://www.machinelearningworkshop.com/

and one of these days this blog will do the same.

W3C Performance Specifications

Here are some of the W3C’s web performance specifications:

  • High Resolution Time (Level 3)

    The DOMHighResTimeStamp type, the performance.now() method, and the performance.timeOrigin attribute of the Performance interface address the shortcomings of Date.now() by providing monotonically increasing time values with sub-millisecond resolution.

    https://w3c.github.io/hr-time

  • Performance Timeline (Level 2)

    Extends the definition of the Performance interface, exposes PerformanceEntry in Web Workers, and adds support for the PerformanceObserver interface.

    https://w3c.github.io/performance-timeline

  • Resource Timing (Level 3)

    Defines the PerformanceResourceTiming interface providing timing information related to resources in a document.

    https://w3c.github.io/resource-timing
    https://w3c.github.io/navigation-timing
    Supported in all browsers except Safari and Opera Mini, starting with IE10

  • User Timing (Level 2)

    Extends the Performance interface with PerformanceMark and PerformanceMeasure (a usage sketch follows this list).

    https://w3c.github.io/user-timing
    Supported in all browsers except Safari and Opera Mini, starting with IE10

  • Beacon API

    Defines a beacon API which can “guarantee” asynchronous and non-blocking delivery of data, while minimizing resource contention with other time-critical operations.

    https://w3c.github.io/beacon
    Not supported in IE, Safari or Opera Mini. Support started with Edge 14
    navigator.sendBeacon() on MDN

  • Preload

    Defines preload for resources which need to be fetched as early as possible, without being immediately processed and executed. Preloaded resources can be specified via declarative markup, the Link HTTP header, or scheduled with JS.

    https://w3c.github.io/preload

  • Cooperative Scheduling of Background Tasks

    Adds the requestIdleCallback method on the Window object, which enables the browser to schedule a callback when it would otherwise be idle, along with the associated cancelIdleCallback and timeRemaining methods.

    https://w3c.github.io/requestidlecallback
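
As a rough sketch of how a few of these fit together (not taken from any one spec; doSomethingExpensive() and the /analytics endpoint are placeholders):

// User Timing: mark and measure a chunk of work
performance.mark('task-start');
doSomethingExpensive();
performance.mark('task-end');
performance.measure('task', 'task-start', 'task-end');

// Performance Timeline: observe measures and resource timings as they are recorded
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(entry.entryType, entry.name, entry.duration);
  }
});
observer.observe({ entryTypes: ['measure', 'resource'] });

// Beacon API: queue a small payload without blocking the page from unloading
window.addEventListener('unload', () => {
  navigator.sendBeacon('/analytics', JSON.stringify({ page: location.href }));
});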

TWELP: Twitter Help

Note: Updated September 15 to add Firefox support.

Sometimes Twitter gets things wrong. Very, very, wrong. A few “features” that I think are bugs include Twitter Moments, promoted tweets, and the “so-and-so liked” tweets cluttering my stream.

To this end, I created a little bookmarklet called “TWELP”.

<a href="javascript:(function($){function kill(){$('.promoted-tweet, .Icon--heartBadge').closest('li.stream-item').css('display','none');$('.js-moments-tab, .DismissibleModule').css('display','none');setTimeout(kill, 1000);}kill();})(window.jQuery);">TWELP</a>

Let me rewrite that for you in a version that’s easy to read, but won’t work to copy and paste:

(function($){
    function kill() {
         $('.promoted-tweet, .Icon--heartBadge').closest('li.stream-item').css('display','none');
         $('.js-moments-tab, .DismissibleModule').css('display','none');

         setTimeout(kill, 1000);
     }

     kill();
})(window.jQuery);

The bookmarklet creates a kill function that:

  1. hides promoted tweets by finding the parent tweet containing a promoted-tweet child class
  2. hides any “liked” tweets that contain the heart icon, including uninteresting tweets in your stream such as the fact that your friend Jane liked a tweet of a picture of her acquaintance Joe, who you are not following, eating an oyster. Seriously, who the fuck cares? It also hides the “people who liked your tweet” feature in your notifications. Not sure if that is a feature or a bug.
  3. hides the “Moments” tab by hiding the tab that has the  js-moments-tab class
  4. hides promoted modules that I hate like “In Case You Missed It” and “Who to follow”
  5. Calls itself once per second so if you scroll, it will continue killing those annoying tweets mentioned above.
  6. You have to pass window.jQuery to $ because Firefox defines its own $. (Thanks to @Potch for that tidbit.)

TWELP – You can drag this link to your bookmarks bar, and click the TWELP bookmarklet whenever you load Twitter. It kills the “Moments” tab and all ads, and removes the “X liked” tweets.

or, you can wrap your own.

Speed Perception

TL;DR: Please take the SpeedPerception challenge to help us confirm results about which web performance metrics best match human perception of speed.

Last summer, I was involved in a study called “SpeedPerception”, a large-scale web performance crowdsourced study focused on the perceived loading performance of above-the-fold content aimed at understanding what “slow” and “fast” mean to users. I am now involved in the second part of this study which aims to confirm (or refute) our findings.

SpeedPerception: the general idea

Traditional web performance metrics, like those defined in the W3C Navigation Timing specification, focus on timing each process along the content delivery pipeline, such as Time to First Byte (TTFB) and Page Load Time. SpeedPerception’s goal is to tackle the web performance measurement challenge by looking at it from a different angle: one that puts the user experience into focus by concentrating on the visual perception of the page load process. We show the user sample video pairs of websites loading, generated with http://www.webpagetest.org/ (WPT), and ask them which of the pair they perceive as having loaded faster.

In the first phase, we measured only Internet Retailer top-500 (IR500) sites at desktop size. Now we are testing whether the results we measured hold more generally: in other words, do they only work for our IR500 sites on desktop? Will we get consistent results when testing Alexa top-1000 (Alexa1000) homepages? Will we see the same results if we test on mobile-size screens with mobile lie-fi performance?

In this second phase, we’re testing both mobile and desktop versions of both IR500  and Alexa1000 website home pages. We’ve also added a way of measuring the user’s time to click so we can compare apples to apples.

The goal is to create a free, open-source, benchmark dataset to advance the systematic study of how human end-users perceive the webpage loading process: the above-the-fold rendering in particular. Our belief (and hope) is that such a benchmark can provide a quantitative basis to compare different algorithms and spur computer scientists to make progress on helping quantify perceived webpage performance.

Take the SpeedPerception challenge!

How was SpeedPerception created?

Videos were created using Patrick Meenan’s open-source WebPagetest (a.k.a. WPT). We made 600+ videos of 2016 IR500 and Alexa1000 home pages loading. The runs were done in February 2017. Videos were turned into GIFs. Video pairs were grouped using a specific set of rules to help limit bias and randomness. Everything is available on GitHub at https://github.com/pdey/SpeedPerception.

Open Source

Just like we did with the results of phase 1 of SpeedPerception, once the crowd-sourcing component generates a sufficient amount of user data, we will open source the dataset, making it available to the web performance community, along with the analysis of what we discover.

Please help us by taking the SpeedPerception Challenge now. Thanks.

Results from Phase 1

In phase 1, we discovered that a combination of three values (an abbreviated SpeedIndex up to time to click (TTC) and an abbreviated Perceptual Speed Index up to TTC, in conjunction with startRender, or Render) can achieve upwards of 85% accuracy in explaining majority human A/B choices. Does the power of this new combination “model” hold true for all sites, or just our original data set? This is what we’re working on finding out.

If you’re interested in phase 1, here’s some more light reading:

 

Survey for US Developers’ Experience

I created a survey of developer experiences for US-based developers.
While the results aren’t scientific, in that the questions and optional answers were not all that well written and the distribution may have been biased (it only reached Twitter users), some of the results were indeed interesting, and may lead to me doing some further, more professional, study.

According to their own self reporting, men experience the following more than women do:

  • People assume men developers have a CS degree (1.18)
  • People think men developers are very skilled at tech when they first meet them (0.89)
  • People assume men developers are good at math (0.80)
  • People make accurate judgments as to men developers’ skill sets when they meet them (0.73)
  • Men developers sign up for conferences or RSVP for meetups and let their bosses and partners know the plans they made (0.65)
  • People assume men developers are gamers (0.59)
  • People listen to men developers (0.54)
  • People assume male developers are straight (0.52)
  • Men developers provide constructive criticism to others online (0.51)
  • People value men developers’ opinions (0.48)
  • People assume men developers are correct (0.45)

According to their own self reporting, women experience the following more than men do:

  • People assume women developers are not technical when they meet them (2.22)
  • Women developers are told to be “nice” or to “smile” (1.98)
  • Women developers have been called bossy (1.80)
  • Women developers are the ones who make sure they have a housesitter, babysitter, dogsitter, or other help so they can attend events such as conferences (1.75)
  • Women developers have been called aggressive (1.65)
  • People question women developers’ accuracy and/or knowledge (1.42)
  • Women developers think about transportation to and from the venue before agreeing to attend events (1.25)
  • People explain women developers’ own content to them, like their tweets and posts (1.18)
  • Women developers go into work on the weekend (1.18)
  • People assume women developers know Photoshop or have other design skills (1.14)
  • People comment on women developers’ appearance (1.12)
  • Women developers think about the different ages of their co-workers (1.06)
  • People interrupt women developers (1.00)
  • People are surprised to learn women developers are gamers (0.98)
  • Women developers think about what they are going to wear to work (0.98)
  • People think women developers are administrative or custodial staff (0.95)

Lyrinė Tālrunis: “Telephone” with Lyric Translations

Last week I hacked together a mashup of HP IdolOnDemand’s free Speech Recognition API and Google’s fee-for-service Translation API to create a Lyric Translator. Of course, I had to make the site responsive using VW units for text and Flexbox for the layout. I also used a datalist to provide an optional list of usable media files. Then, tonight, I wrote a blog post explaining these components.

Enjoy!

The app

Lyrinė Tālrunis, or “Lyric Telephone”, does two things:

  1. It captures the text from an audio file using the free Speech Recognition API from HP IdolOnDemand,
  2. It then translates the captured text from the original language into French, German, Spanish, Vietnamese, and Russian, and then back to the original language, enabling you to create new lyrics for — or a more entertaining interpretation of — your favorite songs.

The name is basically Lyrics, as in song lyrics (though you can use any media file), and ‘Telephone’ in reference to the game of telephone you learned in pre-K, where, when a story gets passed through too many people (or languages), the original text gets morphed into something else.

As deployed, the app provides two preloaded options, Rudolph the Red-Nosed Reindeer and Frosty the Snowman, but you can include a link to any .wav or .mp3 file you find online.

I’ll be expanding the application to allow for file uploads, and hope to implement uploading directly from your device’s microphone with:

<input type="file" accept="audio/*;capture=microphone">

For right now, simply find a song online, or even a video file, and include the full URL in the input box.

HP’s IdolOnDemand Speech Recognition API

IDOL OnDemand’s Speech Recognition API creates a transcript of the text in an audio or video file. It supports seven languages, including both US and UK English (yes, it will produce “colour” instead of “color” if you ask it to).

You do first need to register with IdolOnDemand to get an API Key. Then it’s a simple AJAX call.

The values you need to create the request URL for the Speech Recognition API include:

  • The encoded URI to your .wav or other audio or video file.
  • Your API Key
  • The language your file is in (defaults to en-US)

Here is how I created my URL:

  var apikey = YOUR_HP_API_KEY,
      url = document.getElementById('url').value,
      language = document.getElementById('lang').value,
      query = 'https://api.idolondemand.com/1/api/sync/recognizespeech/v1' + '?';
  query += "&url=" + encodeURIComponent(url);
  query += "&apikey=" + apikey;
  query += "&language=" + (language || "en-US");

Where url is the ID of the input where the user enters the full path to the media file. The input has three default options to choose from, but you can enter any URL you wish. This is explained in the <datalist> / list attribute section below.

The ‘lang’ is the ID of the <select> drop-down that lists the language of the media file.

<select name="lang" id="lang">
  <option value="en-US">English (US)</option>
  <option value="en-GB">English (UK)</option>
  <option value="de-DE">German</option>
  <option value="es-ES">Español</option>
  <option value="fr-FR">Français</option>
  <option value="it-IT">Italiano</option>
</select>

The speech recognition API works for all the above languages and Chinese. Surprisingly, I can actually read and understand all of the languages above (don’t ask).  As I can’t read Chinese and wouldn’t be able to debug it, I didn’t include it in this app.  If you want to include Chinese as an option, please feel free to fork the app repo and add Chinese back in, but please use your own API key.

IDOL OnDemand’s Speech Recognition API uses the language code and the country subcode, so use the long form like “en-US” and not “en”. Don’t know what I am talking about (or have insomnia)? Read up on language tags.

Depending on the file size, your file can take a while to process, so make sure to let your user know something is happening, and make sure to handle errors in case it times out. For better user experience, when the button gets clicked, calling the API, the content of the button changes to a rotating pipe in the hopes of making a quick and easy spinner. (Check out the CSS file if you want to learn the animation, as I am not covering it here). The animation stops and the button returns to the original text when the text extraction of the media file is returned from the Speech API.

 var app = {
     ...
    
     init : function () {
	     // add eventHandler to button
        document.getElementById('doThis').addEventListener('click', function(){
          app.submitToHP();
          app.changeButton();
        });
      },

      // get the words from the original media file
      // the `data` object contains default values & the `apikey`
      submitToHP : function () {
        data.url = document.getElementById('url').value || data.url;
        var query = data.request_url + '?';
            query += "&url=" + encodeURIComponent(data.url);
            query += "&apikey=" + data.apikey;
            query += "&language=" + document.getElementById('lang').value || "en-US");

        var request = $.ajax(query) // request sent; handle the response below
            .done(function(e) {
                // response received - handle it
                app.acceptResponse(e.document[0].content);
              })
            .fail(function(e) {
                // error - handle it
                app.acceptResponse('Oops, something went wrong.');
              })
            .always(function(e) {
                // finished - stop button animation
                app.revertButton();
            });
      },
      ... // continues

The request returns a JSON object:

{
  "document": [
    {
      "content": "the media speech is here"
    }
  ]
}

So we grab that content with:

e.document[0].content

where e is the response.

The other functions included, but not described here, are:

  • app.changeButton() — changes button to a spinner
  • app.revertButton() — resets the button to original behavior
  • app.acceptResponse(e.document[0].content) — writes the text to the page and initiates the translations, which are done via the Google Translate API

Google Translate API

I tried finding a good, free, intuitive, easy-to-use translation API, but came up empty-handed. Sorry, Microsoft, you have too many steps, and I couldn’t just “dive right in.” I do, however, have the Yandex translation API on my list of things to look up. It looks like it might be a good free alternative to Google’s fee-for-service translation API.

Time is money, and I was already familiar with the Google Translate API, which is the main reason I chose it. Again, fork my repo to try something different.

The request URL for Google Translate API looks something like this:

var request_URL = "https://www.googleapis.com/language/translate/v2?key=" +  
  YOUR_GOOGLE_API_KEY +
  "&source=" + from +
  "&target=" + to +
  "&q=" + encodeURIComponent(text);

Where you use your own Google API Key, the ‘from’ is the original language, the ‘to’ is the language you want to translate to, and the text you encode with encodeURIComponent(text) is the return value from the Speech Recognition API, what we captured as e.document[0].content above.

As we are translating from the original language to French, German, Spanish, Vietnamese, and Russian, and then back again to the original language, we are calling the Google API six times. The important information is how to create the link for Google’s REST API. You can take a look at the source code to see how I iterate through the various languages and make the AJAX call for each translation.
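
Here is a rough sketch of that iteration (not the app’s actual code: translate() is a hypothetical helper that wraps the Google Translate request shown above and returns a Promise, capturedText is the text returned by the Speech API, and #result stands in for wherever the final text is written):

var original = 'en';
var languages = ['fr', 'de', 'es', 'vi', 'ru', original];

// pass the text through each language in turn, ending back at the original
function translateChain(text, from, remaining) {
  if (remaining.length === 0) {
    return Promise.resolve(text);
  }
  var to = remaining[0];
  return translate(text, from, to).then(function (translated) {
    return translateChain(translated, to, remaining.slice(1));
  });
}

translateChain(capturedText, original, languages).then(function (finalText) {
  // write the much-mangled result to the page
  document.getElementById('result').textContent = finalText;
});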

CSS3 Values

The page is fully responsive. On larger devices, the font is larger. This is done without @media queries. I know. I know. Media queries are all the rage. But they’re not always necessary. CSS3 provides responsive features that enable the creation of responsive content without having to define where your design splits. The browser can do it for you.

In this case, it’s the VW, or viewport width unit, that makes the site naturally responsive. The VW unit is relative to the viewport width: the viewport width is 100vw. If you don’t know all your length units, my 4-year-old post needs some updating.

  h1 {
    line-height: 30vh;
    font-size: 8vw;
  }

The above CSS snippet reads: “the line height should be 30% of the viewport height; the font size should be 8% of the viewport width.”

As the viewport narrows, the font size will shrink. Yes, it would become illegible if the viewport were narrow enough, but no phone is that narrow: the size is relatively huge, and will even be legible on new watches. And as the viewport height shrinks, so does the line height, meaning the h1 will never be taller than 30% of the height of the viewport.

I used VW and VH for the height of the blue header and containers for the translation content: even when empty, the articles will be 40% of the document height.

The content in the button and the input also grows and shrinks as the viewport grows or shrinks. vh and vw are very well supported, though vmax, the larger of the viewport width and height, and vmin, the lesser of those two values, are not as fully supported.

CSS Flexible Box Layout Module

CSS Layout is fun! No. Seriously. It is. Just use flexbox.

body {
  display: flex;
  flex-direction: column;
  flex: 1 0 300px;
}
header, article {
  flex: 3;
  min-height: 40vh;
}
footer, main {
  flex: 1; 
  max-height: 10vh;
}
main, article {
  display: flex;
  justify-content: space-around;
  align-items: center;
}
section {
  flex: 1;
}
@media screen and (max-width: 500px) {
  main, article {
    flex-wrap: wrap;
  }
}

Above is just part of the CSS: I’ve posted only the rules relevant to the flexible layout of the document. You’ll note that the layout has four vertical sections: the header, the main section with the buttons, the articles where the translations go, and the footer.

The body CSS code block above makes the body a flex container, and the header, main, article, and footer all flex items. The flex-direction is column, so they stack one on top of another; row would put the parts side by side.

The main and article are not only flex items; they, in turn, are also flex containers. The default flex-direction is row, so the children of the main and the children of the article will be laid out side by side within their parent. We have two flex items within the article (the original text result from the Speech API and the final result processed through five translations). These will be side by side and will always be of equal height. They will not wrap by default, but if the screen is 500px wide or less, the row of content can wrap, which means the translation can land below the text capture, and the input can land below the button, which can fall below the language selection.

This project is meant to show a very simple example of flex layout, and is not a full flexbox tutorial. To learn more about flexbox, I have an open source flexbox tutorial you can play with. There you can see that justify-content: space-around; means the extra space is evenly distributed around each item, and you can explore all the other flexbox properties, including several not used in this demonstration.

Datalist element and list attribute

Take a look at the input on the page. You’ll note, in modern browsers, there is a little arrow on the right which, when clicked, shows an autocomplete list. This effect is achieved using the HTML5 list attribute along with the <datalist> element and that element’s nested <option>s.

<input type="url" list="urls" id="url" placeholder=" URL of .wav file">
    
<datalist id="urls">
  <option value="http://estelle.github.io/audiotranslator/data/frosty.wav">Frosty</option>
  <option value="http://estelle.github.io/audiotranslator/data/rudolph.wav">Rudolph</option>
  <option value="http://estelle.github.io/audiotranslator/data/tet.wav">Test File</option>
</datalist>

In this example, we have an <input> of type url with a list attribute. The value of the list attribute is the ID of the datalist element. This associates the urls datalist with the input. If a browser doesn’t support datalist, it will simply not show the options. Totally progressive enhancement: the form control is still usable in Netscape 4.7.

If the browser does support datalist, a drop-down menu of the options will show when the input has focus. It will only show the values that are still potentially valid: if you enter ‘h’, all the values will still show; type any other character, and the options are no longer valid, so they will no longer be displayed. I have a tutorial on web forms which demonstrates the inclusion of datalist on text, url, email, number, color, range, date, and time input types.

Future of the app

Here are some ideas I have for expanding the application. Please fork and do it for me :D

  • File upload option
  • Drag and Drop from desktop to upload file
  • Audio Capture directly from your device
  • Inclusion of original <audio> for your listening pleasure