
Daniel Stenberg: The command line options we deserve

Mozilla planet - do, 20/02/2020 - 20:08

A short while ago curl’s 230th command line option was added (it was --mail-rcpt-allowfails). Two hundred and thirty command line options!

A look at curl history shows that on average we’ve added more than ten new command line options per year for a very long time. As we don’t see any particular slowdown, I think we can expect the number of options to eventually hit and surpass 300.

Is this manageable? Can we do something about it? Let’s take a look.

Why so many options?

There are four primary explanations why there are so many:

  1. curl supports 24 different transfer protocols, and many of them have protocol-specific options for tweaking their operations.
  2. curl’s general design is to let users toggle and set every capability and feature individually. Every little knob can be on or off, independently of the other knobs.
  3. curl is feature-packed. Users can customize and adapt curl’s behavior to a very large extent to truly get it to do exactly the transfer they want, for an extremely wide variety of use cases and setups.
  4. As we work hard at never breaking old scripts and use cases, we don’t remove old options; we only add new ones.
An entire world already knows our options

curl is used and known by a large number of humans. These humans have learned the ways of the curl command line options, for better and for worse. Some think they are strange and hard to use, others see the logic behind them, and the rest simply accept that this is the way they work. Changing options would be perceived as hostile towards all these users.

curl is used by a very large number of existing scripts and programs that have been written, deployed, and sit there doing their work day in and day out. Changing options, even ever so slightly, risks breaking many of these scripts and would upset a lot of developers, causing them anything from a minor nuisance to a very hard time.

curl is the documented way to use APIs, REST calls and web services in numerous places. Lots of examples and tutorials spell out how to use curl to work with services all over the world for all sorts of interesting things. Examples and tutorials written ages ago that showcase curl still work because curl doesn’t break behavior.

curl command lines are even to some extent now used as a translation language between applications:

All four major web browsers let you export HTTP requests as curl command lines that you can then execute from your shell prompt or scripts. Other web tools and proxies can also do this.

There are now also tools that can import said curl command lines so that they can figure out what kind of transfer was exported from those other tools. The applications that import these command lines don’t want to actually run curl; they want to figure out details about the request that the curl command line would have executed (and then possibly run it themselves). The curl command line has become a web request interchange language!

There are also a lot of web services provided that can convert a curl command line into a source code snippet in a variety of languages for doing the same request (using the language’s native preferred method). A few examples of this are: curl as DSL, curl to Python Requests, curl to Go, curl to PHP and curl to perl.
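As a hedged illustration of such a conversion (the command line and its translation below are my own example, not output copied from any of those services), a converter in the “curl to Python Requests” style turns the command in the comment into roughly this:

    # Hypothetical exported command line:
    #   curl 'https://example.com/api/search?q=shoes' \
    #        -H 'Accept: application/json' --compressed
    # A "curl to Python Requests" style conversion produces roughly:
    import requests

    headers = {'Accept': 'application/json'}
    params = {'q': 'shoes'}

    # Requests handles gzip decompression itself, so --compressed needs no extra code.
    response = requests.get('https://example.com/api/search',
                            headers=headers, params=params)
    print(response.status_code)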

Can the options be enhanced?

I hope I’ve made it clear why we need to maintain the options we already support. However, that doesn’t limit what we can add, nor does it prevent us from adding new ways of doing things. It’s just code; of course it can be improved.

We could add alternative options for existing ones that make sense, if there are particular ones that are considered complicated or messy.

We could add a new “mode” that would have a totally new set of options or a new way of approaching what we think of as options today.

Heck, we could even consider doing a separate tool next to curl that would similarly use libcurl for the transfers but offer a totally new command line option approach.

None of these options are ruled out as too crazy or far out. But they all of course require that someone thinks of them as good ideas and is prepared to work on making them happen.

Are the options hard to use?

Oh what a subjective call.

Packing this many features into a single tool and making every single option and superpower intuitive and easy to use is perhaps not impossible, but at least a very, very hard task. Also, the curl command line options have been added organically over a period of more than twenty years, so some of them could of course have been made a little smarter if we could’ve foreseen what would come or how the protocols would later change or improve.

I don’t think curl is hard to use (what, you think I’m biased?).

Typical curl use cases often only need a very small set of options. Most users never learn or ever need to learn most curl options – but they are there to help out when the day comes and the user wants that particular extra quirk in their transfer.

Any other tool, even those that pound their chests and call themselves user-friendly, would see its command lines grow substantially and stop being intuitive and self-explanatory if it grew a feature set close to what curl offers. I don’t think a very advanced tool can remain easy to use in all circumstances. I think the aim should be to keep the common use cases easy. I think we’ve managed this in curl, in spite of having 230 different command line options.

Should we have another (G)UI?

Would a text-based or even graphical UI help or improve curl? Would you use one if it existed? What would it do and how would it work? Feel most welcome to tell me!

Related

See the refreshed cheat sheet and the command line option of the week series.


Mozilla Localization (L10N): L10n Report: February Edition

Mozilla planet - wo, 19/02/2020 - 22:22
Welcome!

Please note some of the information provided in this report may be subject to change as we are sometimes sharing information about projects that are still in early stages and are not final yet. 

New localizers

  • Bora of Kabardian joined us through the Common Voice project
  • Kevin of Swahili
  • Habib and Shamim of Bengali joined us through the WebThings Gateway project

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New community/locales added

New content and projects

What’s new or coming up in Firefox desktop

Firefox is now officially following a 4-week release cycle:

  • Firefox 74 is currently in beta and it will be released on March 10. The deadline to update your localization is February 25.
  • Firefox 75, currently in Nightly, will move to Beta when 74 is officially released. The deadline to update localizations for that version will be on March 24 (4 weeks after the current deadline).

In terms of localization priority and deadlines, note that the content of the What’s new panel, available at the bottom of the hamburger menu, doesn’t follow the release train. For example, content for 74 has been exposed on February 17, and it will be possible to update translations until the very end of the cycle (approximately March 9), beyond the normal deadline for that version.

What’s new or coming up in mobile

We have some exciting news to announce on the Android front!

Fenix, the new Android browser, is going to be released in April. The transition from Firefox for Android (Fennec) to Fenix has already begun! Now that we have an in-app locale switcher in place, we have the ability to add languages even when they are not supported by the Android system itself.

As a result, we’ve opened up the project on Pontoon to many new locales (89 total). Our goal is to reach Firefox for Android parity in terms of completion and number of locales.

This is a much smaller project than Firefox for Android, and a very innovative and fast piece of software. We hope this excites you as much as it excites us. And we truly hope to deliver to users across the world the same experience as with Firefox for Android, in terms of localization.

Delphine will be away for the next few months. Jeff is standing in for her on the PM front, with support from Flod on the technical front. While Delphine is away, we won’t be enabling new locales on mobile products outside of Fenix. This is purely because our current resourcing allows us to give Fenix the priority, but at the expense of other products. Stay tuned for more open and individual outreach from Jeff about Fenix and other mobile projects.

What’s new or coming up in web projects

Mozilla.org

Changes are coming to mozilla.org. The team behind mozilla.org has been working all year to transition from the .lang format to Fluent. Communications on the details around this transition will be coming through the mailing list.

Additionally, the following pages were added since the last report

New:

  • firefox/browsers.lang
  • firefox/products.lang
  • firefox/whatsnew_73.lang
  • firefox/set-default-thanks.lang
Community Participation Guideline (CPG)

The CPG has a major update, including a new page and additional locales. Feel free to review and provide feedback by filing a bug.

Languages:  ar, de, es-ES, fr, hi-IN, id, it, ja, nl, pl, pt-BR, ru, zh-CN, and zh-TW.

What’s new or coming up in SUMO

Firefox 73 is out but did not require localization updates, as many articles were still valid from 72.

The most exciting event of January was All Hands in Berlin. Giulia has written a blog post on the SUMO journey experience at All Hands; you can read it at this link.

Regarding localization, we discussed at length how to keep communication open with the community, and there will be exciting news soon. Keep an eye on the forum!

Events

Want to showcase an event coming up that your community is participating in? Reach out to any l10n-driver, and we’ll include it (see links to emails at the bottom of this report).

Friends of the Lion

Do you know someone in your l10n community who’s been doing a great job and should appear here? Contact one of the l10n-drivers, and we’ll make sure they get a shout-out (see list at the bottom)!

Useful Links

Questions? Want to get involved?

Did you enjoy reading this report? Let us know how we can improve by reaching out to any one of the l10n-drivers listed above.


Support.Mozilla.Org: What’s happening on the SUMO Platform: Sprint updates

Mozilla planet - wo, 19/02/2020 - 21:06

So what’s going on with the SUMO platform? We’re moving forward in 2020 with new plans, new challenges and a new roadmap.

We’re continuing this year to track all development work in 2-week sprints. You can see everything that is currently being worked on and our current sprint here (please note: this is only a project tracking board, do not use it to file bugs; bugs should continue to be filed via Bugzilla).

In order to be more transparent about what’s going on, we are starting a round of blog posts to summarize every sprint and plan for the next. We’ve just closed Sprint no. 3 of 2020 and we’re moving into Sprint no. 4.

What happened in the last two weeks?

During the last two weeks we have been working tirelessly together with our partner, Lincoln Loop, to get Responsive Redesign out the door. The good news is that we are almost done.

We have also been working on a few essential upgrades. Currently support.mozilla.org is running on Python 2.7, which is no longer supported. We have been working on upgrading to Python 3.7 and the latest Django Long Term Support (LTS) version, 2.2. This is also almost done and we are expecting to move into the QA and bug fixing phase.

What’s happening in the next sprint?

During the next two weeks we’re going to start wrapping up Responsive Redesign as well as the Python/Django upgrade and focus on QA and bug fixing. We’re also planning to finalize a Celery 4 upgrade.

The next big thing is the integration of Firefox Accounts. As of May 2019 we have been working towards using Firefox Accounts as the authentication system on support.mozilla.org. Since the first phase of this project was completed, we have been using both login via Firefox Accounts and the old SUMO login. It is now time to fully switch to Firefox Accounts. The current plan is to do this mid-March, but expect to see some communication about this later this week.

For more information please check out our roadmap and feel free to reach out if you have any questions.


Henri Sivonen: Always Use UTF-8 & Always Label Your HTML Saying So

Mozilla planet - wo, 19/02/2020 - 20:21

To avoid having to deal with escapes (other than for <, >, &, and "), to avoid data loss in form submission, to avoid XSS when serving user-provided content, and to comply with the HTML Standard, always encode your HTML as UTF-8. Furthermore, in order to let browsers know that the document is UTF-8-encoded, always label it as such. To label your document, you need to do at least one of the following:

  • Put <meta charset="utf-8"> as the first thing after the <head> tag.

    The meta tag, including its ending > character, needs to be within the first 1024 bytes of the file. Putting it right after <head> is the easiest way to get this right. Do not put comments before <head>.

  • Configure your server to send the header Content-Type: text/html; charset=utf-8 on the HTTP layer.

  • Start the document with the UTF-8 BOM, i.e. the bytes 0xEF, 0xBB, and 0xBF.

Doing more than one of these is OK.
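As a hedged illustration (the Flask handler and the page content below are my own example, not taken from this article), here is a minimal sketch that applies all three labels at once; any single one of them would already be enough:

    # Minimal sketch showing all three labeling options at once: the meta tag,
    # the HTTP header, and the UTF-8 BOM.  The page content is made up.
    from flask import Flask, Response

    app = Flask(__name__)

    PAGE = ('<!DOCTYPE html>'
            '<html><head><meta charset="utf-8">'   # first thing after <head>
            '<title>Hyvää päivää</title></head>'
            '<body><p>Labeled as UTF-8 three times over.</p></body></html>')

    @app.route('/')
    def index():
        body = '\ufeff' + PAGE                     # BOM: bytes EF BB BF once encoded
        return Response(body.encode('utf-8'),
                        content_type='text/html; charset=utf-8')  # HTTP-layer label

    if __name__ == '__main__':
        app.run()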

Answers to Questions

The above says the important bit. Here are answers to further questions:

Why Do I Need to Label UTF-8 in HTML?

Because HTML didn’t support UTF-8 in the very beginning and legacy content can’t be expected to opt out, you need to opt into UTF-8, just like you need to opt into the standards mode (via <!DOCTYPE html>) and into mobile-friendly layout (via <meta name="viewport" content="width=device-width, initial-scale=1">). (Longer answer)

Which Method Should I Choose?

<meta charset="utf-8"> has the benefit of keeping the label within your document even if you move it around. The main risk is that someone forgets that it needs to be within the first 1024 bytes and puts comments, Facebook metadata, rel=preloads, stylesheets or scripts before it. Always put that other stuff after it.

The HTTP header has the benefit that if you are setting up a new server that doesn’t have any old non-UTF-8 documents on it, you can configure the header once, and it works for all HTML documents on the server thereafter.

The BOM method has the problem that it’s too easy to edit the file in a text editor that removes the BOM and not notice that this has happened. However, if you are writing a serializer library and you are neither in control of the HTTP header nor able to inject a tag without interfering with what your users are doing, you can make the serializer always start with the UTF-8 BOM and know that things will be OK.

Can I Use UTF-16 Instead?

Don’t. If you serve user-provided content as UTF-16, it is possible to smuggle content that becomes executable when interpreted as other encodings. This is a cross-site scripting vulnerability if the user uses a browser that allows the user to manually override UTF-16 with another encoding.

UTF-16 cannot be labeled via <meta charset>.

What about Plain Text?

The <meta charset="utf-8"> method is not available for plain text, but the other two are. In the case of plain text, the HTTP header is obviously Content-Type: text/plain; charset=utf-8 instead.

What about JavaScript?

If you’ve labeled your HTML as UTF-8, you don’t need to label your UTF-8-encoded JavaScript files, since by default they inherit the encoding from the document that includes them. However, to make your JavaScript robust when referenced from non-UTF-8 HTML, you can use the UTF-8 BOM or the HTTP header, which is Content-Type: application/javascript; charset=utf-8 in the JavaScript case.

What about CSS?

If you’ve labeled your HTML as UTF-8, you don’t need to label your UTF-8-encoded CSS files, since by default they inherit the encoding from the document that includes them. However, to make your CSS robust when referenced from non-UTF-8 HTML, you can use the UTF-8 BOM or the HTTP header, which is Content-Type: text/css; charset=utf-8 in the CSS case, or you can put @charset "utf-8"; as the very first thing in the CSS file.

What about XML (Including SVG)?

Unlabeled XML defaults to UTF-8, so you don’t need to label it.

What about JSON?

JSON must be UTF-8 and is processed as UTF-8, so there’s no labeling.

What about WebVTT?

WebVTT is always UTF-8, so there’s no labeling.


Henri Sivonen: Why Supporting Unlabeled UTF-8 in HTML on the Web Would Be Problematic

Mozilla planet - wo, 19/02/2020 - 20:20

UTF-8 has won. Yet, Web authors have to opt in to having browsers treat HTML as UTF-8 instead of the browsers Just Doing the Right Thing by default. Why?

I’m writing this down in comprehensive form, because otherwise I will keep rewriting unsatisfactory partial explanations repeatedly as bug comments again and again. For more on how to label, see another writeup.

Legacy Content Won’t Be Opting Out

First of all, there is the “Support Existing Content” design principle. Browsers can’t just default to UTF-8 and have HTML documents encoded in legacy encodings opt out of UTF-8, because there is unlabeled legacy content, and we can’t realistically expect the legacy content to be actively maintained to add opt-outs now. If we are to keep supporting such legacy content, the assumption we have to start with is that unlabeled content could be in a legacy encoding.

In this regard, <meta charset=utf-8> is just like <!DOCTYPE html> and <meta name="viewport" content="width=device-width, initial-scale=1">. Everyone wants newly-authored content to use UTF-8, the No-Quirks Mode (better known as the Standards Mode), and to work well on small screens. Yet, every single newly-authored HTML document has to explicitly opt in to all three, since it isn’t realistic to get all legacy pages to opt out.

Web Content Arrives over Time

But there is no single legacy encoding, so if we want to Support Existing Content, we need some way of deciding which one, and we know that given a document that is valid UTF-8, the probability that it was meant to be something other than UTF-8 is virtually zero. So if we decide which one of the legacy encodings we are dealing with not just by the top-level domain name (or the browser UI locale) but by examining the content, why not autodetect UTF-8?

The issue is not the difficulty of distinguishing UTF-8 from other encodings given the full content. In fact, when loading files from file: URLs, Firefox does detect UTF-8! (Chrome does, too, but less reliably.) For file: URLs, we sacrifice incremental loading on the assumption that most file: URLs point to a local disk (as opposed to a file server mounted as if it were a local drive), which is fast enough that the user would not notice incremental loading anyway. We also assume that file:-URL content is finite.

For http:/https: content, though, incremental processing is important and starting over is bad. Also, some pages intentionally never finish loading and need to be treated as infinite so we never have “full” content!

Encoding Detection Prescan Is Not Like meta charset Prescan

But we already wait for up to 1024 bytes (in Gecko; in WebKit and in Blink it is more complicated) to scan for meta charset, so infinite-loading pages that neither declare the encoding nor send 1024 bytes before some earlier JavaScript has done an out-of-band request to the server to signal that it is OK to send more HTML bytes already stall. Can’t we just scan the first 1024 bytes for UTF-8ness?

This assumes that there is some non-ASCII within the first 1024 bytes. Can we rely on non-ASCII pages to have the first bytes of non-ASCII within the first 1024 bytes? No.

The non-markup bytes are typically either in the general-purpose HTML title element or in the content attribute of the Facebook-purpose meta property="og:title" element. Sadly, it is all too possible for these not to be within the first 1024 bytes, because before them, there are things like IE conditional comments, Facebook bogo-namespaces, a heap of rel=preloads, over a dozen icons for iOS, copyright-related comments, or just scripts and stylesheets declared first.
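To make that limitation concrete, here is a rough sketch of such a prescan (my own illustration, not browser code); on a page whose first kilobyte happens to be pure ASCII it simply cannot decide anything:

    # Naive prescan sketch: returns True/False for a UTF-8 verdict, or None when
    # the first 1024 bytes are all ASCII and therefore prove nothing either way.
    def prescan_looks_like_utf8(first_bytes: bytes):
        prefix = first_bytes[:1024]
        if all(b < 0x80 for b in prefix):
            return None      # all ASCII: UTF-8 and the legacy encodings agree here
        try:
            prefix.decode('utf-8')
            return True
        except UnicodeDecodeError:
            return False     # may also just be a sequence cut off at byte 1024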

What If We Scanned the Whole head?

What if we scanned until the start of body like WebKit does for meta charset (leaving aside for the moment how confidently we can locate the start of body, which has an optional start tag before we start the real tokenization and tree building)? Surely title is somewhere in head, and the user cannot perceive incremental rendering until body starts anyway.

So now we see title while we’ve buffered up bytes and haven’t started the real encoding decoder, the tokenizer, or the tree builder. We can now detect from the content of title, right? For non-Latin scripts, yes. Even just the page title in a non-Latin script is very likely enough to decide UTF-8ness. For languages like German or Finnish, no. Even though just about every German or Finnish document has non-ASCII, there’s a very real chance that the few words that end up in the title are fully ASCII. For languages like English, Somali, Dutch, Indonesian, Swahili, or various Malay languages you have even less hope of there being non-ASCII in the title than with German or Finnish, even though there might be non-ASCII quotation marks, dashes, or a rare non-ASCII letter (such as a rare letter with diaeresis or acute accent) in a full document. For the World-Wide Web, a solution needs to work for these languages, too, and not just for non-Latin-script languages.

Looking Further

OK, so it seems that something more complicated is needed. Let’s think of fundamental requirements:

If Web authors think they can get away with not declaring UTF-8, many, many Web authors are going to leave UTF-8 undeclared. Therefore, we need a solution that works reliably in 100% of cases or we’d make the Web Platform more brittle. Timeouts are by definition dependent on something other than the content, so any solution that hand-waves some problem away by adding a timeout would be unreliable in this sense. Likewise, solutions that depend on how HTML content maps to network protocol buffer boundaries are inherently unreliable in this sense.

Also, making UTF-8 work undeclared should not regress performance compared to labeled UTF-8. A performance regression large enough to make the aggregate user experience worse (especially on slow CPUs and Internet connections) but small enough not to be noticed by authors (especially on fast CPUs and fast Internet connections) would be particularly unfortunate.

This gives us the following basic requirements:

  • Must support existing unlabeled non-UTF-8 content.
  • Must be reliable for unlabeled UTF-8.
  • Must not break incremental rendering of HTML.
  • Must not involve timeouts.
  • Must not depend on network buffer boundaries.
  • Must not regress performance compared to labeled UTF-8.

Let’s look at what’s wrong with potential solutions. (As noted earlier, simply defaulting to UTF-8 without detection would fail to support existing unlabeled non-UTF-8 content.)

Buffer Until Decided

OK, how about we scan until we’ve seen enough non-ASCII to decide, then? This doesn’t work, because for ASCII-only content it would mean buffering all the way to the end of the document, and ASCII-only content is real on the Web. That is, this would break incremental rendering, e.g. for English. Trying to hand-wave the problem away using a timeout would fail the requirement not to have timeouts. Trying to have a limit based on byte count would make the solution unreliable, e.g. for English content that has a copyright sign in the page footer or for Dutch content that has a letter with a diaeresis further from the page start than whatever the limit is.

What Chrome Does but with UTF-8 Detection

Chrome already has detection for legacy encodings. How about detecting UTF-8 byte patterns, too?

This would not be at all reliable. Chrome’s detection is opportunistic from whatever bytes the HTML parser happens to have available up front. This means that the result not only depends on timing and network buffer boundaries but also fails to account for non-ASCII after a long ASCII prefix.

What Chrome Does but with UTF-8 as the Fallback

How about doing what Chrome does, but deciding UTF-8 if all the bytes available at the time of decision are ASCII?

This would break some existing unlabeled non-UTF-8 content with a long ASCII prefix. Additionally, the breakage would be dependent on timing and network buffer boundaries.

What Firefox Does but with UTF-8 Detection

Firefox already has detection for legacy encodings. How about detecting UTF-8 byte patterns, too?

Firefox has a solution that does not depend on timing or network buffer boundaries and that can deal with long ASCII prefixes. If the meta prescan of the first 1024 bytes fails, Firefox runs the encoding detector on those 1024 bytes taking into account the top-level domain as an additional signal. If those bytes are all ASCII (and don’t contain an ISO-2022-JP escape sequence), Firefox at that point decides from the top-level domain. Upon encountering the end of the stream, Firefox guesses again now taking into account all the bytes. If the second guess differs from the first guess, the page is reloaded using the result of the second guess.

(The above description does not apply to the .jp, .in, and .lk TLDs. .jp has a special-purpose detector that detects among Japanese encodings only and triggers the reload, if needed, as soon as the decision is possible. .in and .lk fall back to windows-1252 without detection to accommodate old font hacks.)
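Paraphrased as code (my sketch, with a deliberately crude stand-in detector and an illustrative TLD table, not Gecko's actual implementation), the flow described above looks roughly like this:

    # A rough paraphrase of the Firefox flow described above; not Gecko code.
    def is_all_ascii(data: bytes) -> bool:
        return all(b < 0x80 for b in data)

    def crude_detect(data: bytes, tld: str) -> str:
        # Stand-in detector: all-ASCII falls back to an illustrative TLD-based
        # guess, valid UTF-8 wins, anything else is treated as a legacy encoding.
        fallback = {'ru': 'windows-1251', 'gr': 'ISO-8859-7'}.get(tld, 'windows-1252')
        if is_all_ascii(data):
            return fallback
        try:
            data.decode('utf-8')
            return 'UTF-8'
        except UnicodeDecodeError:
            return fallback

    def choose_encoding(first_1024: bytes, full_body: bytes, tld: str):
        # The meta prescan of the first 1024 bytes is assumed to have failed.
        first_guess = crude_detect(first_1024, tld)
        # Upon reaching the end of the stream, guess again over all the bytes.
        second_guess = crude_detect(full_body, tld)
        # If the second guess differs from the first, the page is reloaded
        # using the second guess.
        return second_guess, second_guess != first_guess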

When there’s a 1024-byte (or longer) ASCII prefix, reloading the page would regress performance relative to labeling UTF-8. Also, there is the additional problem that side effects of scripts (e.g. outbound XHR/Fetch) could be observed twice.

What Firefox Does but Guessing UTF-8 If the First 1024 Bytes Are ASCII

How about guessing UTF-8 instead of making a TLD-based guess when the first 1024 bytes are ASCII?

This solution would be better, but it would regress performance in the form of reloads for existing pages that currently don’t suffer such problems, in order to allow UTF-8 to go undeclared for new pages. Furthermore, pages that load different-origin pages into iframes could be confused by those pages reloading on their own. Sure, this problem is already present in Firefox, but it occurs rarely thanks to the TLD-based guess being pretty good except for non-windows-1252 content on generic domains. This solution would make it occur for every unlabeled non-UTF-8 page with a 1024-byte ASCII prefix. Moreover, this would break legacy-encoded documents that never reach the end of the stream, such as pre-Web-Socket chat response iframes.

Even for new unlabeled UTF-8 pages there would be a performance penalty relative to labeled UTF-8: the cost of processing all the bytes of the page using the detector.

Stopping the Detector Once Confident about UTF-8

Could we do something about the performance penalty for unlabeled UTF-8 content?

Yes, we could. First, the ASCII prefix is already skipped over using SIMD and without pushing to each detector state machine. We could define how many characters of a given UTF-8 sequence length need to be seen in order to stay with UTF-8 and stop running the detector. In the case of two-byte UTF-8 sequences, seeing only one is not enough. In the case of three-byte UTF-8 sequences, maybe even one is enough. This would mitigate the concern of unlabeled UTF-8 suffering a performance penalty relative to labeled UTF-8.

However, this would still leave the issue of reloading non-UTF-8 pages that presently don’t need to be reloaded thanks to the TLD-based guess and the issue of breaking legacy-encoded pages that intentionally never reach the end of the stream.

Passing Through the ASCII Prefix

What’s the deal with the reloading anyway? An ASCII prefix decodes the same in both UTF-8 and legacy encodings (other than UTF-16BE and UTF-16LE, which are handled on the BOM sniffing layer), so why not just pass the ASCII prefix through and make the detection decision afterwards?

That is, instead of treating decoding as a step that happens after detection, how about fusing the detector into a decoder such that the decoder streams ASCII through (to the HTML tokenizer) until seeing an ISO-2022-JP escape or a non-ASCII byte, and in the former case turns into a streaming ISO-2022-JP decoder immediately and in the latter case buffers bytes until the fused detector has confidently made its guess, turns into a decoder for the guessed encoding, outputs the buffer decoded accordingly, and thereafter behaves as a streaming decoder for the guessed encoding?

As with the observation that detecting UTF-8 is simple given access to the whole document, but things being complicated because document loading on the Web happens over time, things with the ASCII prefix are more complicated than they seem.

If the ASCII prefix is passed through to the HTML tokenizer, parsed, and the corresponding part of the DOM built before the encoding is decided, two issues need to be addressed:

  1. The ASCII prefix may contain <script src>, <link rel=stylesheet>, or same-origin <iframe>, and the encoding of the document inherits into those in case they turn out to lack encoding declarations of their own.
  2. A script may have observed document.characterSet.

Does the second issue matter? Maybe it does, in which case passing through the ASCII prefix before deciding the encoding won’t work. However, more likely it doesn’t.

If it doesn’t, we can make up a special name signifying ongoing detection and expose it from document.characterSet and inherit it into external scripts, stylesheets, and same-origin iframes. This means that detection expands from being an HTML loading-specific issue to being something that the script and style loaders need to deal with as well (i.e. they need to also run the detector if the special name is inherited).

If we were to go this route, we should use pre-existing IE special names. The generic detector should be called _autodetect_all and the .jp TLD-specific detector should be called _autodetect. (IE got Japanese detection in IE4. The generic detector was not in IE4 but was added by IE6 at the latest. Hence the Japanese case getting the shorter name.)

In addition to exposing non-encoding-name values via document.characterSet and making detection spill over to the script and style loaders, this poses a problem similar to the earlier ASCII prefix problems: What if there’s a two-byte UTF-8 sequence, which on its own could be plausible as two German windows-1252 characters or as a single legacy CJK character, and then another long stretch of ASCII? For example, UTF-8 ®, which is reasonable in an English page title, maps to 庐 in GBK, 簧 in Big5, 速 in EUC-JP, and 짰 in EUC-KR. The characters land in the most common section (Level 1 Hanzi/Kanji or common Hangul) in each of the four encodings.

So if UTF-8 ® stops ASCII passthrough and starts buffering, because the character alone isn’t a conclusive sign of UTF-8ness, it is easy to break incremental rendering, since on an English page buffering until more non-ASCII characters are found could end up reaching the end of the stream.

ASCII Pass-Through with Length-Limited Subsequent ASCII Runs

The problem could be alleviated in a way that doesn’t depend on timing or on buffer boundaries. If the page indeed is German in windows-1252 or Chinese, Japanese, or Korean in a legacy encoding, there should be more UTF-8 byte sequences at a shorter distance from the previous one than in UTF-8 English (or Dutch, etc.). The German non-ASCII sequences will be relatively far apart, but it’s very improbable that the next occurrence of windows-1252 non-ASCII will also constitute a valid UTF-8 byte sequence. GBK, Big5, EUC-JP, and EUC-KR can easily have multiple consecutive two-byte sequences that are also plausible UTF-8 byte sequences. However, once non-ASCII starts showing up, more non-ASCII is relatively close and at some point, there will be a byte sequence that’s not valid UTF-8.

It should be possible to pick a number such that if the detector has seen non-ASCII but hasn’t yet decided UTF-8 vs. non-UTF-8, if it subsequently sees more ASCII bytes in a row than the chosen number, it decides UTF-8.
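A sketch of that rule (my own illustration with an arbitrary threshold; it also ignores the ISO-2022-JP escape case discussed earlier) could look like this:

    import codecs

    # Sketch of the rule above: once non-ASCII has been seen and it has all been
    # valid UTF-8 so far, a long enough run of subsequent ASCII bytes is taken as
    # a commitment to UTF-8.  The threshold 4096 is an arbitrary placeholder.
    ASCII_RUN_LIMIT = 4096

    def commit_to_utf8(stream_bytes: bytes):
        decoder = codecs.getincrementaldecoder('utf-8')()
        seen_non_ascii = False
        ascii_run = 0
        for b in stream_bytes:
            try:
                decoder.decode(bytes([b]))
            except UnicodeDecodeError:
                return False          # invalid UTF-8: must be a legacy encoding
            if b < 0x80:
                ascii_run += 1
                if seen_non_ascii and ascii_run > ASCII_RUN_LIMIT:
                    return True       # confident enough to stop detecting
            else:
                seen_non_ascii = True
                ascii_run = 0
        return None                   # stream ended while still undecided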

It’s a Design! Why Not Go Ahead and Ship It?

Apart from making the bet that exposing weird values from document.characterSet wouldn’t break the Web, the solution sketched above would involve behaviors that none of Gecko, Blink, or WebKit currently have. Just letting Web authors omit labeling UTF-8 does not seem like a good enough reason to introduce such complexity.

What about text/plain?

text/plain can’t use <meta charset=utf-8> and doesn’t have the issue of re-running the side effects of JavaScript upon reload. How about making Firefox detect UTF-8 for text/plain?

The case against doing this is less strong than in the HTML case. However, it’s a slippery slope. It would be bad for Firefox to do this unilaterally and to provoke Chrome to do more detection if it meant Chrome picking one of the easy-for-Chrome brittle options from the start of the above list instead of doing something robust and cross-vendor-agreed-upon.


Mozilla Open Policy & Advocacy Blog: The new EU digital strategy: A good start, but more to be done

Mozilla planet - wo, 19/02/2020 - 13:27

In a strategy and two white papers published today, the Commission has laid out its vision for the next five years of EU tech policy: achieving trust by fostering technologies working for people, a fair and competitive digital economy, and a digital and sustainable society. This vision includes big ambitions for content regulation, digital competition, artificial intelligence, and cybersecurity. Here we give some recommendations on how the Commission should take it forward.

We welcome the vision the Commission sketches out and are eager to contribute, because the internet today is not what we want it to be. A rising tide of illegal and harmful content, the pervasiveness of the surveillance economy, and increased centralisation of market power have damaged the internet’s original vision of openness. We also believe that innovation and fundamental rights are complementary and should always go hand in hand – a vision we live out in the products we build and the projects we take on. If built on carefully, the strategy can provide a roadmap to address the many challenges we face, in a way that protects citizens’ rights and enhances internet openness.

However, it’s essential that the EU does not repeat the mistakes of the past, and avoids misguided, heavy handed and/or one-size-fits-all regulations. The Commission should look carefully at the problems we’re trying to solve, consider all actors impacted and think innovatively about smart interventions to open up markets and protect fundamental rights. This is particularly important in the content regulation space, where the last EU mandate saw broad regulatory interventions (e.g. on copyright or terrorist content) that were crafted with only the big online platforms in mind, undermining individuals’ rights and competition. Yet, and despite such interventions, big platforms are not doing enough to tackle the spread of illegal and harmful content. To avoid such problematic outcomes, we encourage the European Commission to come up with a comprehensive framework for ensuring that tech companies really do act responsibly, with a focus on the companies’ practices and processes.

Elsewhere we are encouraged to see that the Commission intends on evaluating and reviewing EU competition rules to ensure that they remain fit for purpose. The diminishing nature of competition online and the accelerating trend towards web centralisation in the hands of a few powerful companies goes against the open and diverse internet ecosystem we’ve always fought for. The nature of the networked platform ecosystem is giving rise to novel competition challenges, and it is clear that the regulatory toolbox for addressing them is not fit-for-purpose. We look forward to working with EU lawmakers on how EU competition policy can be modernised, to take into account bundling, vertical integration, the role of data silos, and the potential of novel remedies.

We’re also happy to see the EU take up the mantle of AI accountability and seek to be a standard-setter for better regulation in this space. This is an area that will be of crucial importance in the coming years, and we are intent on shaping a progressive, human-centric approach in Europe and beyond.

The opportunity for EU lawmakers to truly lead and to set the future of tech regulation on the right path is theirs for the taking. We are eager and willing to help contribute and look forward to continuing our own work to take back the web.

The post The new EU digital strategy: A good start, but more to be done appeared first on Open Policy & Advocacy.


Karl Dubost: Week notes - 2020 w07 - worklog - flask blueprint

Mozilla planet - wo, 19/02/2020 - 08:10

A string of secondary issues has been plaguing our restart of anonymous reporting on webcompat.com.

The new anonymous reporting workflow:
  1. A bug is reported anonymously
  2. We send the data to a private repository (waiting for moderation)
  3. We put a placeholder on the public repository, saying that this will be moderated later on.
  4. In the private repo, the moderators can either:
    • set the milestone to accepted in the private repo and the public moderation placeholder will be replaced with the real issue content.
    • close the issue in the private repo (which means it has been rejected) and the public moderation placeholder will be replaced by another message saying it was rejected.

Simple! I had forgotten to handle the case of a private issue with the accepted milestone being closed. This erased a valid moderated issue. Not good. So we fixed it. This is now working.

from string to boolean in python

There was a solution to the issue we had last week about our string which is not a boolean: strtobool. Thanks to rik. Implementation details. Values include on and off. Neat!
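For reference, a small sketch of how strtobool behaves:

    # strtobool maps 'y', 'yes', 't', 'true', 'on', '1' to 1 and
    # 'n', 'no', 'f', 'false', 'off', '0' to 0; anything else raises ValueError.
    from distutils.util import strtobool

    print(bool(strtobool('on')))    # True
    print(bool(strtobool('off')))   # False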

coverage and pytest

In the process of trying to improve the project, I looked at the results of coverage on the project. I was pleasantly surprised for some areas of the code. But I also decided to open a couple of issues related to other parts. The more and better tests we have, the more robust the project will be.

While running coverage, I also stumbled upon this sentence in the documentation:

Nose has been unmaintained for a long time. You should seriously consider adopting a different test runner.

So I decided to create an issue specific on switching from nosetests to pytest.

And I started to work on that. It led to an interesting number of new breakages and warnings. First, pytest works better with installable code.

pip install -e .

So I created a very simple and basic setup.py
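The file itself isn't shown here; a minimal sketch of the kind of setup.py that makes pip install -e . work could look like this (the version number and package discovery are assumptions, not copied from the repository):

    # Hypothetical minimal setup.py; details are assumptions, not the real file.
    from setuptools import setup, find_packages

    setup(
        name='webcompat',
        version='0.0.1',
        packages=find_packages(),
    )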

Then I ran into an issue that has bitten me in the past: Flask blueprints.

Do NOT name the module, the directory and the blueprint with the same name.

Basically our code has this kind of construct. Here is a simplified subtree:

    -- webcompat
    |-- __init__.py
    |-- form.py
    |-- api
    |   |-- __init__.py
    |   |-- uploads.py
    |   |-- endpoints.py
    …
    |-- helpers.py
    |-- views.py

so in webcompat/__init__.py

    from webcompat.api.endpoints import api
    app = Flask(__name__, static_url_path='')
    app.register_blueprint(api)

and in webcompat/api/endpoints/__init__.py

    from webcompat.helpers import cool_feature
    api = Blueprint('api', __name__, url_prefix='/api')

    @api.route('blah')
    def somewhere(foo):
        """blablah"""
        yeah = cool_feature()

So what is happening here? The module and the blueprint share the same name. So if in a test we need to mock cool_feature:

with patch('webcompat.api.endpoints.cool_feature') as mock_cool:

We need to remember that when mocking, we do not mock the feature where it has been defined (aka webcompat.helpers.cool_feature) but where it has been imported (aka webcompat.api.endpoints.cool_feature). We will not be able to mock in this case because there will be a conflict of names. The error will be:

E AttributeError: 'Blueprint' object has no attribute 'endpoints'

because webcompat.api now resolves to the Blueprint named api, which has no attribute endpoints, while the module webcompat.api has one.

So I will need to fix this next week.
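One possible way out, sketched below (my sketch, not necessarily the fix that will land), is to give the Blueprint variable a name that no longer shadows the api module once it is imported into webcompat/__init__.py:

    # Sketch of a possible fix: rename the Blueprint variable so that importing
    # it into webcompat/__init__.py no longer shadows the webcompat.api module.
    # webcompat/api/endpoints/__init__.py
    from flask import Blueprint
    from webcompat.helpers import cool_feature

    api_blueprint = Blueprint('api', __name__, url_prefix='/api')

    @api_blueprint.route('/blah')
    def somewhere():
        """Example route using the helper."""
        return cool_feature()

    # webcompat/__init__.py would then do:
    #   from webcompat.api.endpoints import api_blueprint
    #   app.register_blueprint(api_blueprint)
    # and patch('webcompat.api.endpoints.cool_feature') resolves the module again.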

changing circleCI

I also needed to change the CircleCI configuration to be able to run with pytest, even if it breaks for now.

Friday: diagnosis.

Friday I did some diagnosis, and I'll do more next Monday and probably Tuesday too.

Miscellaneous
  • my keyboard is having another hiccup (this is irregular). I kind of cope with it until there is a new model in the size I want with the new keyboard.
  • left shift key not working 70% of the time
  • number 2 (repeating itself 20% of time)
  • letter m (repeating itself 50% of time)
  • Coronavirus is hitting the boat hard. And some cases pop up here and there, sometimes without apparent reason. I minimize going out of home. Our local hospital has some infected patients. The response from the Japanese authorities seems to be, to say the least… very strange.

Otsukare!


The Mozilla Blog: Thank You, Ronaldo Lemos

Mozilla planet - di, 18/02/2020 - 18:20

Ronaldo Lemos joined the Mozilla Foundation board almost six years ago. Today he is stepping down in order to turn his attention to the growing Agora! social movement in Brazil.

Over the past six years, Ronaldo has helped Mozilla and our allies advance the cause of a healthy internet in countless ways. Ronaldo played a particularly important role on policy issues including the approval of the Marco Civil in Brazil and shaping debates around net neutrality and data protection. More broadly, he brought his experience as an academic, lawyer and active commentator in the fields of intellectual property, technology and culture to Mozilla at a time when we needed to step up on these topics in an opinionated way.

As a board member, Ronaldo also played a critical role in the development of Mozilla Foundation’s movement building strategy. As the Foundation evolved its programs over the past few years, he brought to bear extensive experience with social movements in general — and with the open internet movement in particular. This was an invaluable contribution.

Ronaldo is the Director of the Institute for Technology & Society of Rio de Janeiro (ITSrio.org), Professor at the Rio de Janeiro State University’s Law School and Partner with the law firm Pereira Neto Macedo.

He recently co-founded a political and social movement in Brazil called Agora!. Agora! is a platform for leaders engaged in the discussion, formulation and implementation of public policies in Brazil. It is an independent, plural and non-profit movement that believes in a more humane, simple and sustainable Brazil — in an efficient and connected state, which reduces inequalities and guarantees the well-being of all citizens.

Ronaldo remains a close friend of Mozilla, and we’ll no doubt find ample opportunity to work together with him, ITS and Agora! in the future. Please join me in thanking Ronaldo for his tenure as a board member, and wishing him tremendous success in his new endeavors.

Mozilla is now seeking talented new board members to fill Ronaldo’s seat. More information can be found here: https://mzl.la/MoFoBoardJD

The post Thank You, Ronaldo Lemos appeared first on The Mozilla Blog.


Mike Hoye: Dexterity In Depth

Mozilla planet - di, 18/02/2020 - 16:50


I’m exactly one microphone and one ridiculous haircut away from turning into Management Shingy when I get rolling on stuff like this, because it’s just so clear to me how much this stuff matters and how little sense I might be making at the same time. Is your issue tracker automatically flagging your structural blind spots? Do your QA and UX team run your next reorg? Why not?

This all started life as a rant on Mastodon, so bear with me here. There are two empirically-established facts that organizations making software need to internalize.

The first is that by a wide margin the most significant predictive indicator that there will be a future bug in a piece of software is the relative orgchart distance of the people working on it. People who are working on a shared codebase in the same room but report to different VPs are wildly more likely to introduce errors into a codebase than two people who are on opposite sides of the planet and speak different first languages but report to the same manager.

The second is that the number one predictor that a bug will be resolved is if it is triaged correctly – filed in the right issue tracker, against the right component, assigned to the right people – on the first try.

It’s fascinating that neither of the strongest predictive indicators of the most important parts of a bug’s lifecycle – birth and death – actually take place on the developers’ desk, but it’s true. In terms of predictive power, nothing else in the software lifecycle comes close.

Taken together, these facts give you the tools to roughly predict the effectiveness of collaborating teams, and by analyzing trends among bugs that are frequently re-assigned or re-triaged, can give you a lot of foresight into how, where and why a company needs to retrain or reorganize those teams. You might have read Agile As Trauma recently, in which Dorian Taylor describes agile development as an allergic reaction to previously bad management:

The Agile Manifesto is an immune response on the part of programmers to bad management. The document is an expression of trauma, and its intellectual descendants continue to carry this baggage. While the Agile era has brought about remarkable advancements in project management techniques and development tools, it remains a tactical, technical, and ultimately reactionary movement.

This description is strikingly similar to – and in obvious tension with – Clay Shirky’s description of bureaucracy as the extractive mechanism of complexity and an allergic reaction to previous institutional screwups.

Bureaucracies temporarily suspend the Second Law of Thermodynamics. In a bureaucracy, it’s easier to make a process more complex than to make it simpler, and easier to create a new burden than kill an old one.

… which sounds an awful lot like the orgchart version of “It’s harder to read code than to write it”, doesn’t it?

I believe both positions are correct. But that tension scribes the way forward, I think, for an institutional philosophy that is responsive, flexible and empirically grounded, in which being deliberate about the scale, time, and importance of different feedback cycles gives an organization the freedom to treat scaling like a tool, that the signals of different contexts can inform change as a continuum between the macro and micro levels of organizational structure and practice. Wow, that’s a lot of words in a strange order, but hear me out.

It’s not about agile, or even agility. Agility is just the innermost loops, the smallest manifestation of a wide possible set of tightly-coupled feedback mechanisms. And outside the agile team, adjacent to the team, those feedback loops may or may not exist however much they need to, up and down the orgchart (though there’s not often much “down” left in the orgchart, I’ve noticed, where most agile teams live…) but more importantly with the adjacent and complementary functions that agile teams rely on.

It is self-evident that how teams are managed profoundly affects how they deliver software. But agile development (and every other modern developer-cult I’m aware of) doesn’t close that loop, and in failing to do so agile teams are reduced to justifying their continued existence through work output rather than informing positive institutional change. And I don’t use “cult” lightly, there; the current state of empirical evaluation of agile as a practice amounts to “We agiled and it felt good and seemed to work!” And feeling good and kinda working is not nothing! But it’s a long way from being anything more than that.

If organizations make software, then starting from a holistic view of what “development” and “agility” means and could be, looking carefully at where feedback loops in an organization exist, where they don’t and what information they circulate, all that suggests that there are reliable set of empirical, analytic tools for looking at not just developer practice, but the organizational processes around them. And assessing, in some measurable, empirical way, the real and sustainable value of different software development schools and methodologies.

But honestly, if your UX and QA teams aren’t informing your next reorg, why not?


Hacks.Mozilla.Org: WebThings Gateway Goes Global

Mozilla planet - di, 18/02/2020 - 16:39

Today, we’re releasing version 0.11 of the WebThings Gateway. For those of you running a previous version of our Raspberry Pi build, you should have already received the update. You can check in your UI by navigating to Settings ➡ Add-ons.

Translations and Platforms

The biggest change in this release is our ability to reach new WebThings Gateway users who are not native English speakers. Since the release of 0.10, our incredible community has contributed 24 new language translations via Pontoon, Mozilla’s localization platform, with even more in the works! If your native (or favorite) language is still not available, we would love to have you contribute a translation.

WebThings Gateway UI in Japanese

Users are also now able to install WebThings Gateway in even more ways. We have packages for several Debian, Ubuntu, and Fedora Linux versions available on our releases page. In addition, there is a package for Arch Linux available on the AUR. All of these packages complement our existing Raspberry Pi and Docker images.

Experiments

In this release, we’ve made some changes to our two active experiments.

First, the logs experiment has been promoted! It is now a first-class citizen, enabled for all users. Logging allows you to track changes in property values for your devices over a time period, using interactive graphs.

Logs UI

In other news, we’ve decided to say goodbye to our experimental voice-based virtual assistant. While this was a fun experiment, it was never a practical feature. In our 0.12 release, the back-end commands API, which was used by the virtual assistant, will also be removed, so applications using that interface will need to be updated. Our preferred approach going forward is to have add-ons use the Web Thing API for everything, including voice interactions. Fear not, though. In addition to our Mycroft skill, people in the WebThings community have created multiple add-ons to allow you to interface with your gateway via voice, which are available for installation through Settings ➡ Add-ons.
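As a rough idea of what “using the Web Thing API” means in practice, here is a sketch of listing a gateway's things from Python; the hostname and token are placeholders, and the endpoint and header details are my reading of the gateway's REST API rather than something stated in this post:

    # Sketch: list the things known to a WebThings Gateway.  GATEWAY and TOKEN
    # are placeholders; treat the endpoint and auth details as assumptions.
    import requests

    GATEWAY = 'https://gateway.local'
    TOKEN = 'paste-a-token-created-in-the-gateway-settings-here'

    response = requests.get(GATEWAY + '/things',
                            headers={'Accept': 'application/json',
                                     'Authorization': 'Bearer ' + TOKEN},
                            verify=False)   # local gateways often use self-signed certs
    for thing in response.json():
        print(thing.get('title') or thing.get('name'))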

Miscellaneous

In addition to the notable changes above, there are a host of other updates.

  • Users of our Raspberry Pi image can now disable automatic OTA (over the air) updates, if they so choose.
  • Users can now access the web interface on their local network via http://, so that they’re not faced with an ugly, scary security warning each time.
  • The Progressive Web App (PWA) should be much more stable and reliable now.
  • As always, there have been numerous bug fixes.
What Now?

We invite you to download the new WebThings Gateway 0.11 release and continue to build your own web things with the latest WebThings Framework libraries. If you already have WebThings Gateway installed on a Raspberry Pi, you can expect the Gateway to be automatically updated.

As always, we welcome your feedback on Discourse. Please submit issues and pull requests on GitHub. You can also now chat with us directly on Matrix, in #iot.

The post WebThings Gateway Goes Global appeared first on Mozilla Hacks - the Web developer blog.


Mozilla GFX: Challenge: Snitch on the glitch! Help the Graphics team track down an interesting WebRender bug…

Mozilla planet - di, 18/02/2020 - 15:44

For the past little while, we have been tracking some interesting WebRender bugs that people are reporting in release. Despite best efforts, we have been unable to determine clear steps to reproduce these issues and have been unable to find a fix for them. Today we are announcing a special challenge to the community – help us track down steps to reproduce (a.k.a STR) for this bug and you will win some special, limited edition Firefox Graphics team swag! Read on for more details if you are interested in participating.

What we know so far about the bug:

Late last year we started seeing reports of random UI glitching bugs that people were seeing in release. You can check out some of the reports on Bugzilla. Here is what we know so far about this bug:

  • At seemingly random intervals, one of two things seems to happen: parts of the browser UI glitch, or black boxes appear. (Screenshots: “Glitches!” and “Black boxes!”)
  • The majority of the reports we have seen so far have come from people using NVIDIA graphics cards, although we have seen reports of this happening on Intel and AMD as well. That could simply be because the majority of the people we have officially shipped WebRender to in release are on NVIDIA cards.
  • There doesn’t seem to be one clear driver version correlated to this bug, so we are not sure if it is a driver bug.
  • All reporters so far have been using Windows 10
  • No one who has reported the bug thus far has been able to determine clear and consistent STR, and no one on the Graphics team has found a way to reproduce it either. We all use WebRender daily and none of us have encountered the bug.
How can you help?

Without having a way to reliably reproduce this bug, we are at a loss on how to solve it. So we decided to hold a challenge to engage the community further to help us understand this bug better. If you are interested in helping us get to the root of this tricky bug, please do the following:

  • Download Firefox Nightly (if you don’t already use it)
  • Ideally you are using Windows 10 (but if you see this bug on other platforms, we are interested in hearing about it!)
  • Ensure WebRender is enabled
    • Go to about:config and set gfx.webrender.all to true, then restart your browser
  • If you encounter the bug, help us by filing a bug in Bugzilla with the following details:
    • What website are you on when the bug happens?
    • Does it seem to happen when specific actions are taken?
    • How frequently does the bug happen and can you ‘make’ it happen?
    • Attach the contents of your about:support as a text file
  • The main thing we really need is consistent steps that result in the bug showing up. We will send some limited edition Graphics swag to the first 3 bug reporters who can give us consistent STR!

Even if you can’t easily find STR, we are still interested in hearing about whether you see this bug!

Challenge guidelines

The winners of this challenge will be chosen based on the following criteria:

  • The bug report contains clear and repeatable steps to make the bug happen
    • This can include things like having a specific hardware configuration, using certain add ons and browsing certain sites – literally anything as long as it can reliably and consistently cause the bug to appear
    • BONUS: A member of the Graphics team can follow your steps and can also make the bug appear
  • We will choose the first 3 reporters who can meet this criteria (we say 3 because it is possible there is more than one bug and more than one way to reproduce it)
  • Winners will receive special limited edition Graphics Team swag! (t-shirt and stickers)

Update: we have created the channel #gfx-wr-glitch:mozilla.org on Matrix so you can ask questions/chat with us there. For more info about how to join Matrix, check out: https://wiki.mozilla.org/Matrix


Wladimir Palant: Insights from Avast/Jumpshot data: Pitfalls of data anonymization

Mozilla planet - di, 18/02/2020 - 10:00

There has been a surprising development after my previous article on the topic, Avast having announced that they will terminate Jumpshot and stop selling users’ data. That’s not the end of the story however, with the Czech Office for Personal Data Protection starting an investigation into Avast’s practices. I’m very curious to see whether this investigation will confirm Avast’s claims that they were always fully compliant with the GDPR requirements. For my part, I now got a glimpse of what the Jumpshot data actually looks like. And I learned that I massively overestimated Avast’s success when anonymizing this data.

[Illustration: Conveyor belt putting false noses on avatars in a futile attempt at masking their identity]

In reality, the data sold by Jumpshot contained plenty of user identifiers, names, email addresses, even home addresses. That’s partly due to Avast being incapable or unwilling to remove user-specific data as they planned to. Many issues are generic however and almost impossible to avoid. This once again underlines the central takeaway: anonymizing browser history data is very hard. That’s especially the case if you plan to sell it to advertisers. You can make data completely anonymous, but you will have to dumb it down so much in the process that advertisers won’t have any use for it any more.

Why did I decide to document Avast’s failure in so much detail? My goal is to spread appreciation for the task of data anonymization: it’s very hard to ensure that no conclusions about users’ identity are possible. So maybe whoever is toying with the idea of collecting anonymized data will better think twice whether they really want do go there. And maybe next time we see a vendor collecting data we’ll ask the right questions about how they ensure it’s a “completely anonymous” process.

The data

The data I saw was an example that Jumpshot provided to potential customers: an excerpt of real data for one week of 2019. Each record included an exact timestamp (milliseconds precision), a persistent user identifier, the platform used (desktop or mobile, which browser), the approximate geographic location (country, city and ZIP code derived from the user’s IP address), a guess for user’s gender and age group.

What it didn’t contain was “every click, on every site.” This data sample didn’t belong to the “All Clicks Feed” which has received much media attention. Instead, it was the “Limited Insights Pro Feed” which is supposed to merely cover user’s shopping behavior: which products they looked at, what they added to the cart and whether they completed the order. All of that limited to shopping sites and grouped by country (Germany, UK and USA) as well as product category such as Shoes or Men’s Clothing.

This doesn’t sound like it would contain too much personal data, does it? But it does, thanks to a “referrer” field. This field is supposed to indicate how the user came to the shopping site, e.g. from a Google search page or by clicking an ad on another website. Given the detailed information collected by Avast, determining this referrer website should have been easy – yet Avast somehow failed this task. And so the supposed referrer is typically a completely unrelated random web page that this user visited, and sometimes not even a page but an image or JSON data.

If you extract a list of these referrers (which I did), you see news that people read, their web mail sessions, search queries completely unrelated to shopping, and of course porn. You get a glimpse into what porn sites are most popular, what people watch there and even what they search for. For each user, the “limited insights” actually contain a tiny slice of their entire browsing behavior. Over the course of a week this exposed way too much information on some users however, and Jumpshot customers watching users over longer periods of time could learn a lot about each user even without the “All Clicks Feed.”
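To give an idea of how little effort this extraction takes, here is a minimal sketch of the kind of tally involved. The file name and the tab-separated layout with a “referrer” column are assumptions made purely for illustration:

// Tally the most common referrer hosts in a sample export (Node.js sketch).
// Assumption: a local file "sample.tsv" with a header row containing a "referrer" column.
const fs = require("fs");
const lines = fs.readFileSync("sample.tsv", "utf8").trim().split("\n");
const header = lines[0].split("\t");
const refIndex = header.indexOf("referrer");
const hosts = new Map();
for (const line of lines.slice(1)) {
  const referrer = line.split("\t")[refIndex];
  if (!referrer) continue;
  try {
    const host = new URL(referrer).hostname;
    hosts.set(host, (hosts.get(host) || 0) + 1);
  } catch (e) {
    // Not every "referrer" is a valid URL; skip images, JSON fragments etc.
  }
}
console.log([...hosts.entries()].sort((a, b) => b[1] - a[1]).slice(0, 20));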

What about anonymization?

Some parameters and address parts have been censored in the data. For example, you will see an entry like the following:

http://example.com/email/edit-details/[PII_EMAIL_abcdef1234567890]

A heuristic is at work here and will replace anything looking like an email address with a placeholder. Other heuristics will produce placeholders like [PII_USER_abcdef1234567890] and [PII_NM_abcdef1234567890] – these seem to be more rudimentary, applying based on parameter names. This is particularly obvious in entries like this one:

https://www.ancestry.co.uk/name-origin?surname=[PII_NM_abcdef1234567890]

Obviously, the surname parameter here is merely a search query. Given that search queries aren’t being censored elsewhere, it doesn’t make much sense to censor them here. But this heuristic isn’t terribly clever and cannot detect whether the parameter refers to the user.

Finally, the generic algorithm described in the previous article seems to apply, this one will produce placeholders like [PII_UNKWN_abcdef1234567890].
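To illustrate, here is a minimal sketch of such a parameter-name-based rule. This is my reconstruction for illustration only, not Avast’s actual code, and the parameter list is made up:

// Censor the value of any parameter whose name looks user-related.
const SUSPICIOUS_PARAMS = new Set(["name", "surname", "username", "user", "email"]);
function censorByParamName(url) {
  const u = new URL(url);
  for (const key of [...u.searchParams.keys()]) {
    if (SUSPICIOUS_PARAMS.has(key.toLowerCase())) {
      u.searchParams.set(key, "[PII_NM_abcdef1234567890]");
    }
  }
  return u.toString();
}
console.log(censorByParamName("https://www.ancestry.co.uk/name-origin?surname=Smith"));
// censors a mere search query ...
console.log(censorByParamName("https://nolp.dhl.de/nextt-online-public/set_identcodes.do?zip=12345&idc=9876543210987654321"));
// ... while a package tracking link, which does identify a person, passes through untouched.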

Failures to censor user-specific parameters

It isn’t a big surprise that heuristic approaches will miss some data. The generic algorithm seemed sane from its description in the patent however and should be able to recognize most user-specific data. In reality, this algorithm appears to have been misimplemented, censoring only a few of the relevant parameters and without any apparent system. So you will see addresses like the following without any censoring applied:

https://nolp.dhl.de/nextt-online-public/set_identcodes.do?zip=12345&idc=9876543210987654321

Residents of Germany will immediately recognize this as a DHL package tracking link. The idc parameter is the package identifier whereas the sometimes present zip parameter is the recipient’s ZIP code. And now you’d need to remember that DHL only requires you to know these two pieces of information to access the “detailed view,” the one that will show you the name of whoever received the package. Yes, now we have a name to associate the browsing history with. And even if the zip parameter isn’t in the tracking link – remember, the data contains a guess for it based on the user’s IP address, a fairly accurate one in fact.

Want more examples? Quite a few “referrers” are related to the authentication process of websites. A search for keywords like “oauth”, “openid” or “token” will produce lots of hits, usually without any of the parameters being censored. Worst-case scenario here: somebody with access to Jumpshot data could hijack an already authenticated session and impersonate this user, allowing them to access and modify user’s data. One has to hope that larger websites like Facebook and Google use short enough expiration intervals that such attacks would be impracticable for Jumpshot customers.

JWT tokens are problematic even under ideal conditions however. JWT is an authentication approach which works without server-side state, all the relevant information is encoded in the token itself. These tokens are easily found by searching for the “.ey” keyword. There are some issued by Facebook, AOL, Microsoft and other big names. And after reversing Base64 encoding you get something like:

{"instanceId":"abcd1234","uid":12345,"nonce":"dcba4321","sid":"1234567890"}

Most values here are implementation-specific and differ from site to site. But usually there is some user identifier, either a numerical one (can likely be converted into a user name somewhere on the website), occasionally an email or even an IP address. It also often contains tokens related to the user’s session and potentially allowing hijacking it: session identifier, nonce, sometimes even OAuth tokens.
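Decoding such a token takes only a few lines. A minimal sketch, with a made-up token and payload:

// The JWT payload is just Base64url-encoded JSON; no secret is needed to read it.
function decodeJwtPayload(token) {
  const payload = token.split(".")[1];
  // Convert Base64url to regular Base64 before decoding (works on older Node versions too).
  const b64 = payload.replace(/-/g, "+").replace(/_/g, "/");
  return JSON.parse(Buffer.from(b64, "base64").toString("utf8"));
}
// Hypothetical token carrying the payload {"uid":12345,"sid":"1234567890"}:
const example =
  "eyJhbGciOiJIUzI1NiJ9." +
  Buffer.from('{"uid":12345,"sid":"1234567890"}').toString("base64") +
  ".signature";
console.log(decodeJwtPayload(example)); // -> { uid: 12345, sid: '1234567890' }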

Last but not least, there is this:

https://mail.yandex.ru/u1234/?uid=987654321&login=myyandexname#inbox

This address also wasn’t worth censoring for Avast. Now I never used Yandex Mail but I guess that this user’s email address is myyandexname@yandex.ru. There are quite a few addresses looking like this, though most of them contain only the numerical user identifier. I strongly suspect that some Yandex service or API allows translating these numerical IDs into user names, which would allow deducing the user’s email address as well.

Shortcomings of the heuristics

Now let’s have a look at the heuristic removing email addresses, the last line of defense. This one will reliably remove any URL-encoded email addresses, so you won’t find anything like me%40example.com in the data. But what about unusual encodings? Heuristics aren’t flexible, so these won’t be detected.

It starts with the obvious case of URL encoding applied twice: me%2540example.com. The Avast data contains plenty of email addresses encoded like this, for example:

https://m.facebook.com/login.php?next=https%3A%2F%2Fm.facebook.com%2Fn%2F%3Fthread_fbid%3D123456789%26source%3Demail%26cp%3Dme%2540example.com

Did you notice what happened here? The email address isn’t a parameter to Facebook’s login.php. The only parameter here is next, it’s the address to navigate to after a successful login. And that address just happens to contain the user’s email address as a parameter, for whatever reason. Hence URL encoding applied twice.
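A minimal sketch of this double encoding, with made-up addresses, shows why a single decoding pass never sees the address:

// The email address is a parameter of the *inner* URL, so it gets
// percent-encoded a second time when the inner URL becomes the "next" parameter.
const inner = "https://m.facebook.com/n/?source=email&cp=" + encodeURIComponent("me@example.com");
const referrer = "https://m.facebook.com/login.php?next=" + encodeURIComponent(inner);
console.log(referrer.includes("me%40example.com"));
// false: the data only contains "me%2540example.com"
console.log(decodeURIComponent(referrer).includes("me%40example.com"));
// true: one decode reveals the URL-encoded address
console.log(decodeURIComponent(decodeURIComponent(referrer)).includes("me@example.com"));
// true: a second decode reveals the plain address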

Another scenario:

https://www.google.com/url?q=http://example.com/confirm?email%3dme%2540example.com&source=gmail

What’s that, a really weird Google query? The source=gmail parameter indicates that it isn’t; rather, it’s a link that somebody clicked in Gmail. Apparently, Gmail will send such links as “queries” to the search engine before the user is redirected to their destination. And the destination address contains the email address here, given how the link originated from an address confirmation email. Links from newsletters will also frequently contain the user’s email address.

And then there is this unexpected scenario:

https://mail.yahoo.com/d/search/name=John%2520Smith&emailAddresses=me%2540example.com

I have no idea why search in Yahoo Mail will encode parameters twice but it does. And searches of Yahoo Mail users contain plenty of names and email addresses of the people they communicate with.

Note that I only mentioned the most obvious encoding approach here. Some websites encode their parameters using Base64 encoding for some reason, and these also contain email addresses quite frequently.

Where do these users live?

So far we have names, email and IP addresses. That’s interesting of course but where do these users actually live? Jumpshot data provides only a rough approximation for that. Luckily (or unfortunately – for the users), Google Maps is a wildly popular service, and so it is very present in the data. For example:

https://www.google.de/maps/@52.518283,13.3735008,17z

That’s a set of very precise geographical coordinates, could it be the user’s home? It could be, but it also might be a place where they wanted to go, or just an area that they looked at. The following entry is actually way more telling:

https://www.google.de/maps/dir/Platz+der+Republik+1,+10557+Berlin/Museum+für+Kommunikation,+Leipziger+Straße,+Berlin/@52.5140286,13.3774848,16z

By Avast’s standards, a route planned on Google Maps isn’t personally identifiable information – any number of people could have planned the same route. However, if the start of the route is an address and the end a museum, a hotel or a restaurant, it’s a fairly safe bet that the address is actually the user’s home address. Even when it isn’t obvious which end of the route the user lives at, the ZIP code in the Jumpshot data helps one make an educated guess here.

And then you type “Platz der Republik 1, Berlin” into a search engine and in quite a few cases the address will immediately map to a name. So your formerly anonymous user is now called Angela Merkel.

Wasn’t it all aggregated?

In 2015 Avast’s then-CTO Ondřej Vlček promised:

These aggregated results are the only thing that Avast makes available to Jumpshot customers and end users.

Aggregation would combine data from multiple users into a single record, an approach that would make conclusions about individual users much harder. Sounds quite privacy-friendly? Unfortunately, Jumpshot’s marketing already cast significant doubt on the claims that aggregation is being used consistently.

What was merely a suspicion in my previous blog post is now a fact. I don’t want to say anything about Jumpshot data in general, I haven’t seen all of it. But the data I saw wasn’t aggregated at all, each record was associated with exactly one user and there was a unique user identifier to tell records from different users apart. Also, I’ve seen marketing material for the “All Clicks Feed” suggesting that this data isn’t aggregated either.

The broken promises here aren’t terribly surprising, aggregated data is much harder to monetize. I already quoted Graham Cluley before with his prediction from 2015:

But let’s not kid ourselves. Advertisers aren’t interested in data which can’t help them target you. If they really didn’t feel it could help them identify potential customers then the data wouldn’t have any value, and they wouldn’t be interested in paying AVG to access it.

Conclusions

I looked into a week’s worth of data from a “limited insights” product sold by Jumpshot and I was already able to identify a large number of users, sometimes along with their porn watching habits. The way this data was anonymized by Avast is insufficient to say the least. Companies with full access to the “every click, on every site” product were likely able to identify and subsequently stalk the majority of the affected users. The process of identifying users was easy to automate, e.g. by looking for double encoded email addresses or planned Google Maps routes.

The only remaining question is: why did Avast so vehemently deny selling any personally identifiable data? Merely a few days before deciding to shut down Jumpshot, Avast’s CEO Ondřej Vlček repeated in a blog post:

We want to reassure our users that at no time have we sold any personally identifiable information to a third party.

So far we could only suspect it; now we can all be certain that this statement isn’t true. To give them the benefit of the doubt: how could they have not known? The issues should have been pretty obvious to anybody who took a closer look at the data. The whole scandal took months to unwind. Does this mean that throughout all that time Avast kept repeating this statement, giving it to journalists and spreading it on social media, yet nobody bothered to verify it? If we follow this line of thought then the following statement from the same blog post is clearly a bold lie:

The security and privacy of our users worldwide is Avast’s priority

I for my part removed all the raw and processed Jumpshot data in the presence of witnesses after concluding this investigation. Given the nature of this data, this seems to be the only sensible course of action.


Emily Dunham: Git: moving a module into a monorepo

Mozilla planet - di, 18/02/2020 - 09:00
Git: moving a module into a monorepo

My team has a repo where we keep all our terraform modules, but we had a separate module off in its own repo for reasons that are no longer relevant.

Let’s call the modules repo git@github.com:our-org/our-modules.git. The module moving into it, let’s call it git@github.com:our-org/postgres-module.git, because it’s a postgres module.

First, clone both repos:

git clone git@github.com:our-org/our-modules.git
git clone git@github.com:our-org/postgres-module.git

I can’t just add postgres-module as a remote to our-modules and pull from it, because I need the files to end up in a subdirectory of our-modules. Instead, I have to make a commit to postgres-module that puts its files in exactly the place that I want them to land in our-modules. If I didn’t, the README.md files from both repos would hit a merge conflict.

So, here’s how to make that one last commit:

cd postgres-module
mkdir postgres
git mv *.tf postgres/
git mv *.md postgres/
git commit -m "postgres: prepare for move to modules repo"
cd ..

Notice that I don’t push that commit anywhere. It just sits on my filesystem, because I’ll pull from that part of my filesystem instead of across the network to get the repo’s changes into the modules repo:

cd our-modules
git remote add pg ../postgres-module/
git pull pg master --allow-unrelated-histories
git remote rm pg
cd ..

At this point, I have all the files and their history from the postgres module in the postgres directory of the our-modules repo. I can then follow the usual process to PR these changes to the our-modules remote:

cd our-modules
git checkout -b import-pg-module
git push origin import-pg-module
firefox https://github.com/our-org/our-modules/pull/new/import-pg-module

We eventually ended up skipping the history import on this module (see the sketch below for the shorter, history-free route), but figuring out how to do it properly was still an educational exercise.
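A history-free import can be as simple as a copy and a fresh commit. A rough sketch, using the same made-up repository names as above; the branch name and commit message are only placeholders:

cd our-modules
mkdir postgres
cp ../postgres-module/*.tf ../postgres-module/*.md postgres/
git checkout -b import-pg-module
git add postgres
git commit -m "postgres: import module without history"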


This Week In Rust: This Week in Rust 326

Mozilla planet - di, 18/02/2020 - 06:00

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Crate of the Week

This week's crates are pointer-utils, a small library for working with pointers, and jlrs, a crate to call Julia from Rust.

Thanks to Vikrant for the suggestions!

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

276 pull requests were merged in the last week

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in final comment period.

Tracking Issues & PRs

New RFCs

Upcoming Events

Asia Pacific

Europe

North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Option is null in different clothes, but the clothes that nulls wear are important.

skysch on rust-users

Thanks to Cerberuser for the suggestions!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nasa42 and llogiq.

Discuss on r/rust.


Mozilla Reps Community: Mozilla Reps in 2020 Berlin All Hands

Mozilla planet - ma, 17/02/2020 - 15:30

14 Reps were invited to participate in this year’s All Hands in Berlin.

At the All Hands, Reps learned some easy German words (Innovationsprozess-swischenstands-schreihungsskizze), did some art during a group activity (see the graffiti photo below), and learned about cultural differences in communication.

[Photo: Reps graffiti in Berlin]

Of course, it was not all going around spray-painting walls and trying to understand the German sense of humor; the Reps also got down to serious Reps business.

To make sure that issues relevant to the Reps would be discussed in Berlin, all the invited Reps were asked to fill out a survey. The answers revealed a series of community issues that limit participation, as well as issues with the infrastructure of the Reps program. Combining these answers with the themes the Reps Council worked on in last year’s OKRs, the following issues were prioritized for discussion:

  • campaigns and activities,
  • communication between Mozilla and Reps,
  • how to grow and activate communities,
  • the mentors program and onboarding,
  • the Reps program’s infrastructure.

 

So, to discuss how to tackle these issues and how to bring forward the program in 2020 the reps had meetings (lots and lots of meetings).

[Photo: meetings at the All Hands]

On Tuesday  Reps were divided into small groups to discuss the above-mentioned issues. In every group, they were asked to discuss the current state and to imagine the ideal state for each issue by the end of 2020.

On Wednesday Reps were asked: “are we community coordinators yet?” We agreed to prioritize the themes discussed on Tuesday based on that question, thinking about which of them could strengthen the Reps’ role as community coordinators and, therefore, help grow healthy communities.

Based on that prioritization, two themes emerged as the leading themes for 2020: communication between Mozilla and the Reps program, and campaigns and activities. That, of course, doesn’t mean that the rest of the themes are not equally important. The Reps Council decided to focus on the first two while at the same time opening up participation on the other themes amongst the Reps.

On Wednesday the Reps Council met to discuss how to turn the two major topics into objectives and key results for the program in 2020. This conversation is still ongoing and the Reps Council will publish this year’s OKRs soon.

 

On Thursday all reps discussed the campaigns pipeline and how volunteers can contribute more actively to it.

 

So, what was the conclusion from this tour-de-force?

 

[Photo: more meetings in Berlin]

Communications:

Regarding communication, the Reps established that there is a need for more consistent communication and transparency, so they set this goal to be reached by the end of 2020:

 

  • By the end of 2020, Reps have all the information about Mozilla’s goals and direction and they have the effective tools and processes to pass that information to their communities.

 

In more detail this means, among other things, that there will be a clear channel of communication for Mozilla to reach Reps, consistent communication around Mozilla’s top goals, and the capability to update communities in their own languages. This will require establishing resources dedicated to improving communication.

 

Activities: 

Reps established that there is a need to have more activities and campaigns in which communities can participate and that these campaigns need to be all in one place (the community portal). The objective is that:

 

  • By the end of 2020, Reps can suggest activities or campaigns to volunteers at any given time

 

Reps will also become more active in all the stages of a campaign pipeline, from the ideation to the implementation phase. To this end, communication and information between reps and projects/staff should become simpler.

Let us know what you think in the comments.

 

On behalf of the Community Development Team

Francesca and Konstantina


Daniel Stenberg: curl ootw: –mail-from

Mozilla planet - ma, 17/02/2020 - 14:55

(older options of the week)

--mail-from has no short version. This option was added to curl 7.20.0 in February 2010.

I know this surprises many curl users: yes curl can both send and receive emails.

SMTP

curl can do “uploads” to an SMTP server, which usually has the effect that an email is delivered somewhere onward from that server. I.e., curl can send an email.

When communicating with an SMTP server to send an email, curl needs to provide a certain set of data, and one of the mandatory fields is the email address of the sender.

It should be noted that this is not necessarily the same email address as the one that is displayed in the From: field that the recipient’s email client will show. But it can be the same.

Example

curl --mail-from from@example.com --mail-rcpt to@example.com smtp://mail.example.com -T body.txt

Receiving emails

You can use curl to receive emails over POP3 or IMAP. SMTP is only for delivering email.
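For example, a rough sketch of fetching mail over POP3 (host name and credentials are made up):

# List the messages in the mailbox:
curl --user 'user:secret' pop3s://mail.example.com/

# Download message number 1:
curl --user 'user:secret' pop3s://mail.example.com/1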

Related options

--mail-rcpt for the recipient(s), and --mail-auth for the authentication address.


Zibi Braniecki: JavaScript Internationalization in 2020

Mozilla planet - vr, 14/02/2020 - 23:28

2020 is shaping up to be an amazing year for JavaScript Internationalization API.

After many years of careful design we’re seeing a lot of the work now coming close to completion with a number of high profile APIs on track for inclusion in ECMAScript 2020 standard!

What’s coming up?

Let’s cut to the chase. Here’s the list of APIs coming to the JavaScript environment of your choice:

Intl.RelativeTimeFormat

// Create a relative time formatter in your locale
// with default values explicitly passed in.
const rtf = new Intl.RelativeTimeFormat("en", {
  localeMatcher: "best fit", // other values: "lookup"
  numeric: "always", // other values: "auto"
  style: "long", // other values: "short" or "narrow"
});

// Format relative time using negative value (-1).
rtf.format(-1, "day");
// > "1 day ago"

// Format relative time using positive value (1).
rtf.format(1, "day");
// > "in 1 day"

Intl.RelativeTimeFormat has been the first of the new APIs finished by the ECMA402 Work Group in a number of years.

Many UIs want to show date and time in a relative format because humans tend to understand such relative terms much better than absolute values.

This provides a foundation for a much better user experience that works across languages and cultures without requiring websites to ship their own data.

Status: The original request dates from September 2015; the proposal was granted Stage 4 approval at the TC39 meeting in December 2019 and is now ready for inclusion into the spec.

Intl.Locale

let loc = new Intl.Locale("pl-u-hc-h12", {
  calendar: "gregory"
});
console.log(loc.language); // "pl"
console.log(loc.hourCycle); // "h12"
console.log(loc.calendar); // "gregory"
console.log(loc.toString()); // "pl-u-ca-gregory-hc-h12"

Intl.Locale is a user-friendly implementation of the Unicode Locale Identifier part of UTS #35.

It brings the ability to perform basic operations such as parsing, serializing, modifying and reading elements of locale identifiers.

This removes a common scenario where a developer is trying to dissect a language identifier string by slicing it manually.

It also allows users to construct or augment locales with options that then get passed to other Intl formatters:

let loc = new Intl.Locale("en-CA", { region: "US", hourCycle: "h24", }); let dtf = new Intl.DateTimeFormat(loc, { hour: "numeric" }); console.log(dtf.resolvedOptions().hourCycle); // "h24"

Status: Intl.Locale, originally proposed in September 2016, was approved for Stage 4 at last month’s TC39 meeting and is ready for landing in the main ECMA402 spec.

Intl.NumberFormat rev. 2

(299792458).toLocaleString("en-US", {
  style: "unit",
  unit: "meter-per-second",
  unitDisplay: "short"
});
// "299,792,458 m/s"

Unified NumberFormat revision 2 is the very first case of a significant rewrite of an already existing JavaScript Intl API.

The motivation is to build on top of an excellent rewrite of the formatter that has been completed in ICU and to enable a much richer set of formatting operations on numbers.

It brings unit formatting, scientific notation, and control over sign display.

(987654321).toLocaleString("en-US", {
  notation: "scientific"
});
// 9.877E8

(987654321).toLocaleString("en-US", {
  notation: "engineering"
});
// 987.7E6

(987654321).toLocaleString("en-US", {
  notation: "compact",
  compactDisplay: "long"
});
// 987.7 million

(55).toLocaleString("en-US", {
  signDisplay: "always"
});
// +55

All those features were added while preserving backward compatibility of the API, which is nothing short of amazing.

The new revision of the API lays the foundation for internationalization of a much broader range of use cases, and thanks to support for the formatToParts API its output can be easily styled for display.

Status: The API combines a number of requests, the most prominent of which was the request for unit formatting from September 2015. The proposal was approved for Stage 4 at last month’s TC39 meeting and is ready for landing in the main ECMA402 spec.

Intl.ListFormat

// Create a list formatter in your locale
// with default values explicitly passed in.
const lf = new Intl.ListFormat("en", {
  localeMatcher: "best fit", // other values: "lookup"
  type: "conjunction", // "conjunction", "disjunction" or "unit"
  style: "long", // other values: "short" or "narrow"
});

lf.format(['Motorcycle', 'Truck', 'Car']);
// > "Motorcycle, Truck, and Car"

Intl.ListFormat provides basic capabilities of formatting a list of elements. In the most common scenario, it shows a conjunction list, but can also be used for disjunction lists and even unit lists.

const list = ['Motorcycle', 'Bus', 'Car'];

console.log(new Intl.ListFormat('en-GB', { style: 'long', type: 'conjunction' }).format(list));
// > Motorcycle, Bus and Car

console.log(new Intl.ListFormat('en-GB', { style: 'short', type: 'disjunction' }).format(list));
// > Motorcycle, Bus or Car

console.log(new Intl.ListFormat('en-GB', { style: 'narrow', type: 'unit' }).format(list));
// > Motorcycle Bus Car

The unit option is particularly useful in conjunction with Intl.NumberFormat when working with lists of units; together they can produce values such as 6m 10cm or 2h 35m, as the sketch below shows.
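A quick sketch of that combination; the exact spacing and separators depend on the locale data in the engine you run it in:

// Format each measurement with Intl.NumberFormat, then join the results
// with a unit-style Intl.ListFormat.
const hours = new Intl.NumberFormat("en", {
  style: "unit", unit: "hour", unitDisplay: "narrow"
}).format(2); // e.g. "2h"
const minutes = new Intl.NumberFormat("en", {
  style: "unit", unit: "minute", unitDisplay: "narrow"
}).format(35); // e.g. "35m"
const lf = new Intl.ListFormat("en", { style: "narrow", type: "unit" });
console.log(lf.format([hours, minutes])); // e.g. "2h 35m"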

Status: The API was originally requested in September 2015 and is currently in Stage 3. We intend to request TC39’s approval for Stage 4 over the next couple of months.

Intl.DateTimeFormat dateStyle/timeStyle

let o = new Intl.DateTimeFormat("en", { timeStyle: "short" });
console.log(o.format(Date.now())); // "13:31"

let o = new Intl.DateTimeFormat("en", { dateStyle: "short" });
console.log(o.format(Date.now())); // "21.03.2012"

let o = new Intl.DateTimeFormat("en", { timeStyle: "medium", dateStyle: "short" });
console.log(o.format(Date.now())); // "21.03.2012, 13:31"

The original design of Intl.DateTimeFormat required developers to specify which fields they want displayed and in which style, like this:

let date = new Date(Date.UTC(2012, 11, 20, 3, 0, 0, 200));

// request a weekday along with a long date
let options = {
  weekday: 'long',
  year: 'numeric',
  month: 'long',
  day: 'numeric'
};

console.log(new Intl.DateTimeFormat('de-DE', options).format(date));
// → "Donnerstag, 20. Dezember 2012"

While this model provides flexibility, it is far from intuitive. It requires the developer to make specific decisions for each piece of the user interface, increasing the odds of inconsistency.

The historical reasons for the formatter to require all fields to be listed are now gone and we can introduce a much simplified approach for most common scenarios:

let date = new Date(Date.UTC(2012, 11, 20, 3, 0, 0, 200));

// request a weekday along with a long date
let options = {
  dateStyle: 'full',
};

console.log(new Intl.DateTimeFormat('de-DE', options).format(date));
// → "Donnerstag, 20. Dezember 2012"

What’s even more important, the most popular formats of displaying time and date may differ per locale, and yet this approach locks down which fields in which style will be displayed for all of them.

Let’s see what happens when we format the dateStyle: "medium" in several locales:

let date = new Date(Date.UTC(2012, 11, 20, 3, 0, 0, 200));

let options = {
  dateStyle: 'medium',
};

console.log(date.toLocaleString("pl", {dateStyle: "medium"}));
// → "20 gru 2012"

console.log(date.toLocaleString("ja-JP", {dateStyle: "medium"}));
// → "2012/12/19"

console.log(date.toLocaleString("he", {dateStyle: "medium"}));
// → "19 בדצמ׳ 2012"

As you can see, all three display a medium-length date, but the styles of the chosen fields differ. That was not possible before: a developer would have had to enforce their idea of what a medium-style date is onto all locales. Now we can have dates and times that are both more internationalized and easier to work with!

Status: The request from October 2016 is now in Stage 3 and we hope to bring it to the TC39 committee for approval for Stage 4 over the next couple months.

Intl.DisplayNames

// Get display names of language in English
var languageNames = new Intl.DisplayNames(['en'], {type: 'language'});
console.log(languageNames.of('fr')); // "French"
console.log(languageNames.of('de')); // "German"
console.log(languageNames.of('fr-CA')); // "Canadian French"
console.log(languageNames.of('zh-Hant')); // "Traditional Chinese"
console.log(languageNames.of('en-US')); // "American English"
console.log(languageNames.of('zh-TW')); // "Chinese (Taiwan)"

// Get display names of script in English
let scriptNames = new Intl.DisplayNames(['en'], {type: 'script'});
console.log(scriptNames.of('Latn')); // "Latin"
console.log(scriptNames.of('Arab')); // "Arabic"
console.log(scriptNames.of('Kana')); // "Katakana"

// Get display names of region in English
let regionNames = new Intl.DisplayNames(['en'], {type: 'region'});
console.log(regionNames.of('419')); // "Latin America"
console.log(regionNames.of('BZ')); // "Belize"
console.log(regionNames.of('US')); // "United States"
console.log(regionNames.of('BA')); // "Bosnia & Herzegovina"
console.log(regionNames.of('MM')); // "Myanmar (Burma)"

// Get display names of currency code in English
let currencyNames = new Intl.DisplayNames(['en'], {type: 'currency'});
console.log(currencyNames.of('USD')); // "US Dollar"
console.log(currencyNames.of('EUR')); // "Euro"
console.log(currencyNames.of('TWD')); // "New Taiwan Dollar"
console.log(currencyNames.of('CNY')); // "Chinese Yuan"

In order to format dates, times and other elements, the JavaScript engine has to carry a lot of data for a lot of locales.

One of the most common requests from developers working on making their JavaScript-powered websites and applications internationalized is to present a language selector.

In other cases, they’d like to format names of months, weekdays or currencies.

Intl.DisplayNames is a new API intended to expose translations for basic units used in other formatters.

While the scope of the proposal has been reduced for the initial revision, the second revision should bring us much-awaited date- and time-related terms:

// Display names in English
symbolNames = new Intl.DisplayNames(['en'], {type: 'dateSymbol'});
symbolNames.of('saturday'); // => "Saturday"
symbolNames.of('september'); // => "September"
symbolNames.of('q1'); // => "1st quarter"
symbolNames.of('pm'); // => "PM"

This API should end up being useful not just for language selectors, but also eventually for date/time pickers etc.
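For example, a minimal sketch of the data behind a language selector, using only the part of the API described above (the language codes are arbitrary examples):

// Build label pairs for a language selector: each language named both in
// itself and in the UI language (English here).
const codes = ["en-US", "pl", "ja", "de"];
const inEnglish = new Intl.DisplayNames(["en"], { type: "language" });
const choices = codes.map((code) => ({
  code,
  nativeLabel: new Intl.DisplayNames([code], { type: "language" }).of(code),
  englishLabel: inEnglish.of(code),
}));
console.log(choices);
// e.g. [{ code: "pl", nativeLabel: "polski", englishLabel: "Polish" }, ...]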

Status: The request, originally made in September 2015, is now a proposal in Stage 3 and we hope to reach Stage 4 for the first revision this year.

Intl.DateTimeFormat.formatRange

let date1 = new Date(Date.UTC(2007, 0, 10, 10, 0, 0));
let date2 = new Date(Date.UTC(2007, 0, 10, 11, 0, 0));

let fmt1 = new Intl.DateTimeFormat("en", {
  year: '2-digit',
  month: 'numeric',
  day: 'numeric',
  hour: 'numeric',
  minute: 'numeric'
});

console.log(fmt1.format(date1));
console.log(fmt1.formatRange(date1, date2));
// > '1/10/07, 10:00 AM'
// > '1/10/07, 10:00 – 11:00 AM'

The formatRange proposal extends Intl.DateTimeFormat with the ability to format ranges of dates.

This feature is particularly useful for all date/time pickers, calendars, booking apps etc.

Status: The request from October 2017, is now in Stage 3 and we hope to finalize it this year.

What’s next?

We’re wrapping up and getting ready to tie all the loose ends and get the ECMAScript 2020 edition ready.

It’ll take a bit of time for the implementers to deploy and enable all of them, but with good specification polyfills, we should be able to start building on top of them already this year.

On top of that, we’re working on a number of proposals such as Intl.Segmenter, Intl.DurationFormat, and others including a major effort to design future localization system.

If you like what’s going on, and want to contribute to making JavaScript even better for writing multilingual software, check the issues in ecma402 repo, and consider joining our monthly meeting!

Credits

Such a major effort would not be possible without an excellent group of experts from all around the World.

The original ECMA402 1.0 was edited by Norbert Lindenberg based on the work by Nebojša Ćirić and Jungshik Shin and support from a large number of domain experts.

Many of the proposals that are getting stabilized today came to life when Caridy Patiño was the editor of ECMA402, and all of that ground work was only possible with the help from Eric Ferraiuolo.

Both of them enabled the large influx of proposals that came in 2015 and we were all lucky to see that matched by an influx of industry experts who came to aid us in the process – Allen Wirfs-Brock, Rick Waldron, André Bargull, Steven R. Loomis, Daniel Ehrenberg, Rafael Xavier, Shane Carr, Leo Balter, and Frank Tang.

On top of the core contributors, a large number of people provided input, feedback, criticism, participated in design discussions, implemented polyfills and provided implementers feedback.

It’s great to see that a large number of currently discussed issues and new proposals are driven by new members of the working group and I expect a high influx of ideas and proposals to come when all of the above mentioned APIs start being usable by the Web Platform users!


The Firefox Frontier: Resolve data breaches with Firefox Monitor

Mozilla planet - vr, 14/02/2020 - 20:37

Corporate data breaches are an all too common reality of modern life. At best, you get an email from a company alerting you that they have been hacked, and then …



Alessio Placitelli: Extending Glean: build re-usable types for new use-cases

Mozilla planet - vr, 14/02/2020 - 15:41
(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.) …

Mark Banner: ESLint now turned on for all of the Firefox/Gecko codebase

Mozilla planet - vr, 14/02/2020 - 11:08

About 4 years and 2 months ago, Dave Townsend and I landed a couple of patches on the Mozilla codebase that kick-started rolling out ESLint across our source code. Today, I’ve just landed the last bug in making it so that ESLint runs across our whole tree (where possible).

ESLint is a static analyser for JavaScript that helps find issues before you even run the code. It also helps to promote best practices and styling, reducing the need for comments in reviews.

Several Mozilla projects had started using ESLint in early 2015 – Firefox’s Developer Tools, Firefox for Android and Firefox Hello. It was clear to the Firefox desktop team that ESLint was useful and so we put together an initial set of rules covering the main desktop files.

Soon after, we were enabling ESLint over more of desktop’s files, and adding to the rules that we had enabled. Once we had the main directories covered, we slowly started enabling more directories and started running ESLint checks in CI allowing us to detect and back out any failures that were introduced. Finally, we made it to where we are today – covering the whole of the Firefox source tree, mozilla-central.

Along the way we’ve filed over 600 bugs for handling ESLint roll-out and related issues, many of these were promoted as mentored bugs and fixed by new and existing contributors – a big thank you to you all for your help.

We’ve also found and fixed many bugs as we’ve gone along. From small bugs in rarely used code, to finding issues in test suites where entire sections weren’t being run. With ESLint now enabled, it helps protect us against mistakes that can easily be detected but may be hard for humans to spot, and reduces the work required by both developer and reviewer during the code-review-fix cycle.

Although ESLint is now running on all the files we can, there’s still more to do. In a few places, we skipped enabling rules because it was easier to get ESLint just running and we also wanted to do some formatting on them. Our next steps will be to get more of those enabled, so expect some more mentored bugs coming soon.

Thank you again to all those involved with helping to roll out ESLint across the Firefox code-base, this has helped tremendously to make Firefox’s and Gecko’s source code to be more consistently formatted and contain less dead code and bugs. ESLint has also been extremely helpful in helping switch away from older coding patterns and reduce the use of risky behaviour during tests.

