Why I joined Mozilla’s Board of Directors
Over the past 20 years, I’ve focused on consumer-driven businesses. From factories making jeans (at Levi’s) to the world’s largest retailer (Walmart) to the early years of social-driven commerce (at ModCloth), understanding what drives consumer behavior is where I love to spend my time.
When I joined Choose Energy as an early employee and CEO, I hoped that if we could consumerize electricity, we could change the grid. If I told you it was the cost difference of a latte a month to switch to green power, it’d be a no-brainer. I was wrong. People don’t understand their bills (what’s a kWh?), what drives their bills (hint: it’s not the kids leaving the lights on), or how to judge the cost (is 18¢/kWh a lot?). When you build a marketplace on top of those things, it’s a challenge. Consumers know they *could* change their provider (in the half of the US where you have the choice), but it’s probably item 8 on their to-do list, right behind rebalancing their 401k. It was my first exposure to the gap between what consumers say and what they do. Joining Rothy’s posed a similar question. Was the brand about a sustainable shoe (3D knit from recycled plastic water bottles), about comfort, or about a beautiful shoe? The short answer: it has to be a beautiful shoe that consumers want to wear first; the sustainability is a ‘nice to have’ and a reason to tell your friends about it.

That’s a long preamble to why Mozilla. When I was first approached about joining the Mozilla board, the recruiter asked me what I was looking for in new boards. My answer was short: something that matters, and a board and management team where I could make a difference. Mozilla’s mission matters. And the time is now. In an era where Big Tech is less trusted and under intense scrutiny, Mozilla has a massive opportunity to shape the future of an open, accessible internet.
This team is fantastic. From the MoCo board to the MoFo board to the long-time, authentic leadership of Mitchell Baker and the new(ish) voice of Steve Teixeira on where our products can go, this is a great team. Changing consumer behavior is hard. But critical. Amid misinformation, content proliferation, social media, and the growth of AI, tech is changing our lives quickly. How we continue to build Mozilla’s voice in this changing landscape is an awesome challenge and one I’m excited to lend a voice to.
Introducing Mozilla.ai: Investing in trustworthy AI
We’re committing $30M to build Mozilla.ai: A startup — and a community — building a trustworthy, independent, and open-source AI ecosystem.
We’re only three months into 2023, and it’s already clear what one of the biggest stories of the year is: AI. AI has seized the public’s attention like Netscape did in 1994, and the iPhone did in 2007.
New tools like Stable Diffusion and the just-released GPT-4 are reshaping not just how we think about the internet, but also communication and creativity and society at large. Meanwhile, relatively older AI tools like the recommendation engines that power YouTube, TikTok and other social apps are growing even more powerful — and continuing to influence billions of lives.
This new wave of AI has generated excitement, but also significant apprehension. We aren’t just wondering What’s possible? and How can people benefit? We’re also wondering What could go wrong? and How can we address it? Two decades of social media, smartphones and their consequences have made us leery.
Mozilla has been asking these questions about AI for a while now — sketching out a vision for trustworthy AI, mobilizing our community to document what’s broken and investing in startups that are trying to create more responsible AI.
We’ve learned that this coming wave of AI (and also the last one) has tremendous potential to enrich people’s lives. But it will only do so if we design the technology very differently — if we put human agency and the interests of users at the core, and if we prioritize transparency and accountability. The AI inflection point that we’re in right now offers a real opportunity to build technology with different values, new incentives and a better ownership model.
The good news: We’ve met literally thousands of founders, engineers, scientists, designers, artists and activists who are taking this approach to AI. Smart, dedicated people are building open-source AI technology, testing out new approaches to auditing and figuring out how to build ‘trust’ into AI in the real world.
The less good news: We don’t see this happening amongst the big tech and cloud companies with the most power and influence. Meanwhile, these incumbents continue to consolidate their control over the market.
In short: Some people are starting to do things differently, but the most significant work (and investment) is happening the same old way. We want to change this.
So, today we are announcing Mozilla.ai: A startup — and a community — that will build a trustworthy and independent open-source AI ecosystem. Mozilla will make an initial $30M investment in the company.
The vision for Mozilla.ai is to make it easy to develop trustworthy AI products. We will build things and hire / collaborate with people that share our vision: AI that has agency, accountability, transparency and openness at its core. Mozilla.ai will be a space outside big tech and academia for like-minded founders, developers, scientists, product managers and builders to gather. We believe that this group of people, working collectively, can turn the tide to create an independent, decentralized and trustworthy AI ecosystem — a real counterweight to the status quo.
Mozilla.ai’s initial focus? Tools that make generative AI safer and more transparent. And, people-centric recommendation systems that don’t misinform or undermine our well-being. We’ll share more on these — and what we’re building — in the coming months.
This new company will be led by Managing Director Moez Draief. Moez has spent over a decade working on practical applications of cutting-edge AI, as an academic at Imperial College and LSE and as a chief scientist in industry. Harvard’s Karim Lakhani, Credo’s Navrina Singh and I will serve as the initial Board of Mozilla.ai.
Later this year, we will announce additional initiatives, partners and events where people can get involved. If you are interested in collaborating, reach out at hello@mozilla.ai.
Mozilla Launches Responsible AI Challenge
Over the last few months, it has become clear that AI is no longer our future but our present. Some of the most exciting ideas for the future of both the internet and the world involve AI solutions. This didn’t happen overnight; decades of work have gone into this moment. Mozilla has been working to make sure that the future of AI benefits humanity in the right ways by investing in the creation of trustworthy AI.
We want entrepreneurs and builders to join us in creating a future where AI is developed through this responsible lens. That’s why we are relaunching our Mozilla Builders program with the Responsible AI Challenge.
At Mozilla, we believe in AI: in its power, its commercial opportunity, and its potential to solve the world’s most challenging problems. But now is the moment to make sure that it is developed responsibly to serve society.
If you want to build (or are already building) AI solutions that are ambitious but also ethical and holistic, the Mozilla Builders Responsible AI Challenge is for you. We will be inviting the top nominees to join a gathering of the brightest technologists, business leaders and ethicists working on trustworthy AI to help get their ideas off the ground. Participants will also have access to mentorship from some of the best minds in the industry, the chance to meet key contributors in this community, and an opportunity to win funding for their project.
Mozilla will be investing $50,000 in the top applications and projects, with a grand prize of $25,000 for the first-place winner.
For more information, please visit here. Applications open March 30, 2023.
Email protection just got easier in Firefox
If you’re already one of the many people who use Firefox Relay to keep your real email address away from trackers and spammers, then we’ve got a timesaver for you. We are testing a new way for Firefox Relay users to access their email masks directly from Firefox on numerous sites.
Since its launch, Firefox Relay has blocked more than 2.1 million unwanted emails from people’s inboxes while keeping real email addresses safe from trackers across the web. We’re always listening to our users, and one of the most-requested features is having Firefox Relay directly within the Firefox browser. And if you don’t already use Firefox Relay, you can always sign up.
How to use your Firefox Relay email masks in Firefox
In the physical world, we limit sharing our home address. Yet, in the online world, we’re constantly asked for our email address and we freely share it with almost every site we come across. It’s our Firefox Relay users who think twice before sharing their email address, using email masks instead of their real email address to keep their personal information safe.
So, when a Firefox Relay user visits some sites in the Firefox browser and is prompted to sign up and share their email address, they can use one of their Firefox Relay email masks or create a new one. See how it works:
We hope to expand to more sites and to all Firefox users later this year.
Firefox Relay users can also opt out of this new feature so that they’re no longer prompted to use an email mask when they come across the pop-up. If they want to manage their Firefox Relay email masks, they can visit their dashboard on the Firefox Relay site.
Thousands of users have signed up for our smart, easy solution that hides their real email address to help protect their identity. Wherever you go online, Mozilla’s trusted products and services can help you feel safer knowing that you have privacy protection for your everyday online life.
If you don’t have Firefox Relay, you can subscribe today from the Firefox Relay site.

Firefox Android’s new privacy feature, Total Cookie Protection, stops companies from keeping tabs on your moves
In case you haven’t heard, there’s an ongoing conversation about your personal data.
Earlier this year, United States President Biden said in his State of the Union address that there need to be stricter limits on the personal data that companies collect. Additionally, a recent survey found that most people said they’d like to control the data that companies collect about them, yet they don’t understand how online tracking works or what they can do about it. Companies are now testing ways to anonymize the third-party cookies that track people on the web, or to get consent for each site or app that wants to track people’s behavior across the web.
These days, who can you trust with your personal data? Mozilla. We have over a decade of anti-tracking work with products and features that protect people, their privacy and their online activity. Today, we’re announcing the official rollout of one of our strongest privacy features, Total Cookie Protection, to automatically block cross-site tracking on Firefox Android.
Yes, companies gather your data when you go from site to site
Before we talk about Total Cookie Protection, let’s talk about cross-site tracking. Transactions we used to handle in person, like shopping for groceries or buying gifts for friends, have become commonplace online. What people may not be aware of are the other transactions happening behind the scenes.
For example, as you’re shopping for a gift and going from site to site looking for the right one, your activity is being tracked without your consent. Companies use a specific kind of cookie, known as a third-party cookie, which gathers information about you and your browsing behavior and follows you as you go from site to site. Companies use that information to build profiles and target ads aimed at convincing you to purchase, like resurfacing an item you were shopping for. So Mozilla created Total Cookie Protection to block companies from gathering information about you and your browsing behavior.
Meet Firefox’s Total Cookie Protection, which stops cookies from tracking you around the web and is now available on Firefox Android. Last year, Firefox rolled out our strongest privacy feature, Total Cookie Protection, across Windows, Mac and Linux. Total Cookie Protection works by maintaining a separate “cookie jar” for each website you visit. Any time a website, or third-party content embedded in a website, deposits a cookie in your browser, Firefox Android confines that cookie to the cookie jar assigned to that website. This way, no other website can reach into cookie jars that don’t belong to it to find out what other websites’ cookies know about you. Now you can say goodbye to those annoying ads following you around and reduce the amount of information that companies gather about you whenever you go online.
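To make the “cookie jar” idea concrete, here is a toy sketch in Python. It is not Firefox’s actual implementation, just an illustration of the partitioning idea: cookies set by embedded third parties are keyed by the top-level site you were visiting, so the same tracker ends up with a different jar on every site.

```python
from collections import defaultdict

# One "jar" per top-level site you visit. A cookie set by an embedded third
# party is stored in the jar of the site you were on, not in one global store.
jars: dict[str, dict[tuple[str, str], str]] = defaultdict(dict)

def set_cookie(top_level_site: str, cookie_origin: str, name: str, value: str) -> None:
    jars[top_level_site][(cookie_origin, name)] = value

def get_cookie(top_level_site: str, cookie_origin: str, name: str):
    # The same tracker embedded on a different site looks into a different jar,
    # so it can't recognize you across sites.
    return jars[top_level_site].get((cookie_origin, name))

set_cookie("shoes.example", "tracker.example", "id", "abc123")
print(get_cookie("shoes.example", "tracker.example", "id"))  # abc123
print(get_cookie("news.example", "tracker.example", "id"))   # None: separate jar
```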
Firefox’s Total Cookie Protection covers you across all your devices
Whether you’re browsing at your desk or on your phone, you’ll now get Firefox’s strongest privacy protection to date. Firefox will confine cookies to the site where they were created, preventing tracking companies from using those cookies to follow your browsing from site to site. To work seamlessly across your devices, sign up for a free Firefox Account. You’ll be able to easily pick up your last open tab between your devices. Bonus: you can also access your saved passwords from your other devices.

For more on Firefox:
- Don’t “accept all cookies” until you’ve seen this video
- Mozilla Explains: Cookies and supercookies
- How Firefox’s Total Cookie Protection and container extensions work together
- Firefox rolls out Total Cookie Protection by default to all users worldwide
Will Kahn-Greene: Socorro Engineering: 2022 retrospective
2022 took forever. At the same time, it kind of flew by. 2023 is already moving along, so this post is a month late. Here's the retrospective of Socorro engineering in 2022.
Will Kahn-Greene: Bleach 6.0.0 release and deprecation
Bleach is a Python library for sanitizing and linkifying text from untrusted sources for safe usage in HTML.
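For anyone who hasn't used it, here's a minimal sketch of typical Bleach usage (output shown in the comments is approximate and assumes the default allowed-tags configuration):

```python
import bleach

# Escape markup that isn't in the allowed list; keep tags like <em> that are.
print(bleach.clean("an <script>evil()</script> example with <em>markup</em>"))
# -> an &lt;script&gt;evil()&lt;/script&gt; example with <em>markup</em>

# Turn bare URLs in text into <a> links (rel="nofollow" is added by default).
print(bleach.linkify("docs live at https://bleach.readthedocs.io/en/latest/"))
```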
Bleach v6.0.0 released!
Bleach 6.0.0 cleans up some issues in linkify and with the way it uses html5lib so it's easier to reason about. It also adds support for Python 3.11 and cleans up the project infrastructure.
There are several backwards-incompatible changes, hence the 6.0.0 version.
https://bleach.readthedocs.io/en/latest/changes.html#version-6-0-0-january-23rd-2023
I did some rough testing with a corpus of Standup message data, and it looks like bleach.clean is slightly faster with 6.0.0 than 5.0.0.
Using Python 3.10.9:
5.0.0: bleach.clean on 58,630 items 10x: minimum 2.793s
6.0.0: bleach.clean on 58,630 items 10x: minimum 2.304s
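The post doesn't include the benchmark code; a rough comparison like the one above could be reproduced with something along these lines (the corpus here is just a placeholder for the Standup messages):

```python
import timeit
import bleach

messages = ["<b>status:</b> fixed the thing", "<script>alert(1)</script> oops"]  # placeholder corpus

def run():
    for text in messages:
        bleach.clean(text)

# Mirror the "10x: minimum" numbers above: run the whole corpus 10 times, keep the best.
best = min(timeit.repeat(run, repeat=10, number=1))
print(f"bleach.clean on {len(messages)} items, best of 10 runs: {best:.3f}s")
```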
The other big change 6.0.0 brings with it is that it's now deprecated.
Bleach is deprecated
Bleach sits on top of html5lib which is not actively maintained. It is increasingly difficult to maintain Bleach in that context and I think it's nuts to build a security library on top of a library that's not in active development.
Over the years, we've talked about other options:
1. find another library to switch to
2. take over html5lib development
3. fork html5lib and vendor and maintain our fork
4. write a new HTML parser
5. etc.
With the exception of option 1, they greatly increase the scope of the work for Bleach. They all feel exhausting to me.
Given that, I think Bleach has run its course and this journey is over.
What happens now?
Possibilities:
Pass it to someone else?
No, I won't be passing Bleach to someone else to maintain. Bleach is a security-related library, so making a mistake when passing it to someone else would be a mess. I'm not going to do that.
Switch to an alternative?
I'm not aware of any alternatives to Bleach. I don't plan to work on coordinating the migration for everyone from Bleach to something else.
Oh my goodness--you're leaving us with nothing?
Sort of.
I'm going to continue doing minimal maintenance:
- security updates
- support for new Python versions
- fixes for egregious bugs (begrudgingly)
I'll do that for at least a year. At some point, I'll stop doing that, too.
I think that gives the world enough time for either something to take Bleach's place, or for the sanitizing web api to kick in, or for everyone to come to the consensus that they never really needed Bleach in the first place.

Bleach. Tired. At the end of its journey.
Thanks!
Many thanks to Greg, who I worked with on Bleach for a long while and who maintained Bleach for several years. Working with Greg was always easy and his reviews were thoughtful and spot-on.
Many thanks to Jonathan who, over the years, provided a lot of insight into how best to solve some of Bleach's more squirrely problems.
Many thanks to Sam who was an indispensable resource on HTML parsing and sanitizing text in the context of HTML.
Where to go for more
For more specifics on this release, see here: https://bleach.readthedocs.io/en/latest/changes.html#version-6-0-0-january-23rd-2023
Documentation and quickstart here: https://bleach.readthedocs.io/en/latest/
Source code and issue tracker here: https://github.com/mozilla/bleach
Wladimir Palant: Bitwarden design flaw: Server side iterations
In the aftermath of the LastPass breach it became increasingly clear that LastPass didn’t protect their users as well as they should have. When people started looking for alternatives, two favorites emerged: 1Password and Bitwarden. But do these do a better job at protecting sensitive data?
For 1Password, this question could be answered fairly easily. The secret key functionality decreases usability, requiring the secret key to be moved to each new device used with the account. But the fact that this random value is required to decrypt the data means that the encrypted data on 1Password servers is almost useless to potential attackers. It cannot be decrypted even for weak master passwords.
As to Bitwarden, the media mostly repeated their claim that the data is protected with 200,001 PBKDF2 iterations: 100,001 iterations on the client side and another 100,000 on the server. This being twice the default protection offered by LastPass, it doesn’t sound too bad. Except: as it turns out, the server-side iterations are designed in such a way that they don’t offer any security benefit. What remains are 100,000 iterations performed on the client side, essentially the same protection level as for LastPass.
Mind you, LastPass isn’t only being criticized for using a default iterations count that is three times lower than the current OWASP recommendation. LastPass also failed to encrypt all data, a flaw that Bitwarden doesn’t seem to share. LastPass also kept the iterations count for older accounts dangerously low, something that Bitwarden hopefully didn’t do either (Edit: yes, they did; some accounts have a considerably lower iteration count). LastPass also chose to downplay the breach instead of suggesting meaningful mitigation steps, something that Bitwarden hopefully wouldn’t do in this situation. Still, the protection offered by Bitwarden isn’t exactly optimal either.
Edit (2023-01-23): Bitwarden increased the default client-side iterations to 350,000 a few days ago. So far this change only applies to new accounts, and it is unclear whether they plan to upgrade existing accounts automatically. And today OWASP changed their recommendation to 600,000 iterations; it has been adjusted for current hardware.
Edit (2023-01-24): I realized that some of my concerns were already voiced in Bitwarden’s 2018 Security Assessment. Linked to it in the respective sections.
Contents
- How Bitwarden protects users’ data
- What this means for decrypting the data
- What this means for you
- Is Bitwarden as bad as LastPass?
- How server-side iterations could have been designed
How Bitwarden protects users’ data
Like most password managers, Bitwarden uses a single master password to protect users’ data. The Bitwarden server isn’t supposed to know this password. So two different values are being derived from it: a master password hash, used to verify that the user is allowed to log in, and a key used to encrypt/decrypt the data.

If we look at how Bitwarden describes the process in their security whitepaper, there is an obvious flaw: the 100,000 PBKDF2 iterations on the server side are only applied to the master password hash, not to the encryption key. This is pretty much the same flaw that I discovered in LastPass in 2018.
What this means for decrypting the data
So what happens if some malicious actor happens to get a copy of the data, like it happened with LastPass? They will need to decrypt it. And for that, they will have to guess the master password. PBKDF2 is meant to slow down verifying whether a guess is correct.
Testing the guesses against the master password hash would be fairly slow: 200,001 PBKDF2 iterations here. But the attackers wouldn’t waste time doing that of course. Instead, for each guess they would derive an encryption key (100,000 PBKDF2 iterations) and check whether this one can decrypt the data.
This simple tweak removes all the protection granted by the server-side iterations and speeds up master password guessing considerably. Only the client-side iterations really matter as protection.
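To make the shortcut concrete, here is a minimal sketch in Python. It is illustrative only: it glosses over details such as the exact salt and Bitwarden’s additional key stretching, and try_decrypt stands in for testing a candidate key against the encrypted vault.

```python
import hashlib

CLIENT_ITERATIONS = 100_000  # the only stretching an offline attacker has to repeat per guess

def candidate_key(guess: str, salt: bytes) -> bytes:
    # Client-side derivation only: master password -> master key via PBKDF2-HMAC-SHA256.
    return hashlib.pbkdf2_hmac("sha256", guess.encode("utf-8"), salt, CLIENT_ITERATIONS)

def crack(vault_ciphertext: bytes, salt: bytes, wordlist, try_decrypt):
    # The 100,000 server-side iterations never enter the picture: each guess is
    # tested directly against the encrypted data, not against the login hash.
    for guess in wordlist:
        if try_decrypt(vault_ciphertext, candidate_key(guess, salt)):
            return guess
    return None
```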
What this means for you
The default protection level of LastPass and Bitwarden is identical. This means that you need a strong master password. And the only real way to get there is generating your password randomly. For example, you could generate a random passphrase using the diceware approach.
Using a dictionary for 5 dice (7776 dictionary words) and picking out four random words, you get a password with slightly over 50 bits of entropy. I’ve done the calculations for guessing such passwords: approximately 200 years on a single graphics card or $1,500,000.
This should be a security level sufficient for most regular users. If you are guarding valuable secrets or are someone of interest for state-level actors, you might want to consider a stronger password. Adding one more word to your passphrase increases the cost of guessing your password by factor 7776. So a passphrase with five words is already almost unrealistic to guess even for state-level actors.
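The numbers above are easy to check with a quick back-of-the-envelope calculation:

```python
import math

WORDS = 6 ** 5  # 7776 words on a standard diceware list (five dice per word)

for n in (4, 5):
    bits = n * math.log2(WORDS)
    print(f"{n} words: {bits:.1f} bits of entropy, about 2**{round(bits)} possible passphrases")
# 4 words: ~51.7 bits; 5 words: ~64.6 bits. Each extra word multiplies the
# guessing work by 7776.
```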
All of this assumes that your KDF iterations setting is set to the default 100,000. Bitwarden will allow you to set this value as low as 5,000 without even warning you. This was mentioned as BWN-01-009 in Bitwarden’s 2018 Security Assessment, yet here we are five years later. Should your setting be too low, I recommend fixing it immediately. Reminder: the current OWASP recommendation is 310,000.
Is Bitwarden as bad as LastPass?
So as it turns out, with the default settings Bitwarden provides exactly the same protection level as LastPass. This is only part of the story however.
One question is how many accounts have a protection level below the default configured. It seems that before 2018 Bitwarden’s default used to be 5,000 iterations. Then the developers increased it to 100,000 in multiple successive steps. When LastPass did that, they failed to upgrade existing accounts. I wonder whether Bitwarden also has older accounts stuck on suboptimal security settings.
The other aspect here is that Dmitry Chestnykh wrote about Bitwarden’s server-side iterations being useless in 2020 already, and Bitwarden should have been aware of it even if they didn’t realize how my research applies to them as well. On the other hand, using PBKDF2 with only 100,000 iterations isn’t a great default today. Still, Bitwarden failed to increase it in the past years, apparently copying LastPass as “gold standard” – and they didn’t adjust their PR claims either:

Users have been complaining and asking for better key derivation functions since at least 2018. It was even mentioned as BWN-01-007 in Bitwarden’s 2018 Security Assessment. This change wasn’t considered a priority however. Only after the LastPass breach did things start moving, and it wasn’t Bitwarden’s core developers driving the change. Someone contributed the changes required for scrypt support and Argon2 support. The former was rejected in favor of the latter, and Argon2 will hopefully become the default (only?) choice at some point in the future.
Adding a secret key like 1Password would have been another option to address this issue. This suggestion has also been around since at least 2018 and accumulated a considerable amount of votes, but so far it hasn’t been implemented either.
On the bright side, Bitwarden clearly states that they encrypt all your vault data, including website addresses. So unlike with LastPass, any data lifted from Bitwarden servers will in fact be useless until the attackers manage to decrypt it.
How server-side iterations could have been designed
In case you are wondering whether it is even possible to implement a server-side iterations mechanism correctly: yes, it is. One example is the onepw protocol Mozilla introduced for Firefox Sync in 2014. While the description is fairly complicated, the important part is: the password hash received by the server is not used for anything before it passes through additional scrypt hashing.
Firefox Sync has a different flaw: its client-side password hashing uses merely 1,000 PBKDF2 iterations, a ridiculously low setting. So if someone compromises the production servers rather than merely the stored data, they will be able to intercept password hashes that are barely protected. The corresponding bug report has been open for the past six years and is still unresolved.
The same attack scenario is an issue for Bitwarden as well. Even if you configure your account with 1,000,000 iterations, a compromised Bitwarden server can always tell the client to apply merely 5,000 PBKDF2 iterations to the master password before sending it to the server. The client has to rely on the server to tell it the correct value, and as long as low settings like 5,000 iterations are supported this issue will remain.
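A small sketch of why this matters. This is illustrative only: the 600,000 floor and the use of the email address as salt are assumptions for the example, not Bitwarden’s actual client code.

```python
import hashlib

CLIENT_SIDE_FLOOR = 600_000  # hypothetical minimum a hardened client could enforce

def login_hash(master_password: str, email: str, server_reported_iterations: int) -> bytes:
    # The iteration count comes from the server. A compromised server can report
    # a tiny value (say 5,000) and receive a barely-stretched hash, regardless of
    # what the account is actually configured with. A defensive client would
    # refuse to go below a hard floor:
    if server_reported_iterations < CLIENT_SIDE_FLOOR:
        raise ValueError("server requested a dangerously low KDF work factor")
    return hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        email.lower().encode("utf-8"),
        server_reported_iterations,
    )
```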
Niko Matsakis: Rust in 2023: Growing up
When I started working on Rust in 2011, my daughter was about three months old. She's now in sixth grade, and she's started growing rapidly. Sometimes we wake up to find that her clothes don't quite fit anymore: the sleeves might be a little too short, or the legs come up to her ankles. Rust is experiencing something similar. We've been growing tremendously fast over the last few years, and any time you experience growth like that, there are bound to be a few rough patches. Things that don't work as well as they used to. This holds both in a technical sense (there are parts of the language that don't seem to scale up to Rust's current size) and in a social one (some aspects of how the project runs need to change if we're going to keep growing the way I think we should). As we head into 2023, with two years to go until the Rust 2024 edition, this is the theme I see for Rust: maturation and scaling.
TL;DR
In summary, these are (some of) the things I think are most important for Rust in 2023:
- Implementing “the year of everywhere” so that you can make any function async, write impl Trait just about anywhere, and fully utilize generic associated types; planning for the Rust 2024 edition.
- Beginning work on a Rust specification and integrating it into our processes.
- Defining rules for unsafe code and smooth tooling to check whether you’re following them.
- Supporting efforts to teach Rust in universities and elsewhere.
- Improving our product planning and user feedback processes.
- Refining our governance structure with specialized teams for dedicated areas, a more scalable structure for broad oversight, and more intentional onboarding.
What do async-await, impl Trait, and generic parameters have in common? They’re all essential parts of modern Rust, that’s one thing. They’re also all, in my opinion, in a “minimum viable product” state. Each of them has some key limitations that make them less useful and more confusing than they have to be. As I wrote in “Rust 2024: The Year of Everywhere”, there are currently a lot of folks working hard to lift those limitations through a number of extensions:
- Generic associated types (stabilized in October, now undergoing various improvements!)
- Type alias impl trait (proposed for stabilization)
- Async functions in traits and “return position impl Trait in traits” (static dispatch available on nightly, but more work is needed)
- Polonius (under active discussion)
None of these features are “new”. They just take something that exists in Rust and let you use it more broadly. Nonetheless, I think they’re going to have a big impact, on experienced and new users alike. Experienced users can express more patterns more easily and avoid awkward workarounds. New users never have to experience the confusion that comes from typing something that feels like it should work, but doesn’t.
One other important point: Rust 2024 is just around the corner! Our goal is to get any edition changes landed on master this year, so that we can spend the next year on finishing touches. This means we need to put some effort into thinking ahead and planning what we can achieve.
Towards a Rust specification
As Rust grows, there is increasing need for a specification. Mara had a recent blog post outlining some of the considerations, and especially the distinction between a specification and standardization. I don't see the need for Rust to get involved in any standards bodies; our existing RFC and open-source process works well. But I do think that for us to continue growing out the set of people working on Rust, we need a central definition of what Rust should do, and that we need to integrate that definition into our processes more thoroughly.
In addition to long-standing docs like the Rust Reference, the last year has seen a number of notable efforts towards a Rust specification. The Ferrocene language specification is the most comprehensive, covering the grammar, name resolution, and overall functioning of the compiler. Separately, I’ve been working on a project called a-mir-formality, which aims to be a “formal model” of Rust’s type system, including the borrow checker. And Ralf Jung has MiniRust, which is targeting the rules for unsafe code.
So what would an official Rust specification look like? Mara opened RFC 3355, which lays out some basic parameters. I think there are still a lot of questions to work out. Most obviously, how can we combine the existing efforts and documents? Each of them has a different focus and — as a result — a somewhat different structure. I’m hopeful that we can create a complementary whole.
Another important question is how to integrate the specification into our project processes. We’ve already got a rule that new language features can’t be stabilized until the reference is updated, but we’ve not always followed it, and the lang docs team is always in need of support. There are hopeful signs here: both the Foundation and Ferrocene are interested in supporting this effort.
Unsafe code
In my experience, most production users of Rust don't touch unsafe code, which is as it should be. But almost every user of Rust relies on dependencies that do, and those dependencies are often the most critical systems.
At first, the idea of unsafe code seems simple. By writing unsafe, you gain access to new capabilities, but you take responsibility for using them correctly. But the more you look at unsafe code, the more questions come up. What does it mean to use those capabilities correctly? These questions are not just academic, they have a real impact on optimizations performed by the Rust compiler, LLVM, and even the hardware.
Eventually, we want to get to a place where those who author unsafe code have clear rules to follow, as well as simple tooling to test if their code violates those rules (think cargo test --unsafe). Authors who want more assurance than dynamic testing can provide should have access to static verifiers that can prove their crate is safe, and we should start by proving the standard library is safe.
We’ve been trying for some years to build that world but it’s been ridiculously hard. Lately, though, there have been some breakthroughs. Gankra’s experiments with strict_provenance APIs have given some hope that we can define a relatively simple provenance model that will support both arbitrary unsafe code trickery and aggressive optimization, and Ralf Jung’s aforementioned MiniRust shows how a Rust operational semantics could look. More and more crates test with miri to check their unsafe code, and for those who wish to go further, the kani verifier can check unsafe code for UB (more formal methods tooling here).
I think we need a renewed focus on unsafe code in 2023. The first step is already underway: we are creating the opsem team. Led by Ralf Jung and Jakob Degen, the opsem team has the job of defining “the rules governing unsafe code in Rust”. It’s been clear for some time that this area requires dedicated focus, and I am hopeful that the opsem team will help to provide that.
I would like to see progress on dynamic verification. In particular, I think we need a tool that can handle arbitrary binaries. miri is great, but it can’t be used to test programs that call into C code. I’d like to see something more like valgrind or ubsan, where you can test your Rust project for UB even if it’s calling into other languages through FFI.
Dynamic verification is great, but it is limited by the scope of your tests. To get true reliability, we need a way for unsafe code authors to do static verification. Building static verification tools today is possible but extremely painful. The compiler’s APIs are unstable and a moving target. The stable MIR project proposes to change that by providing a stable set of APIs that tool authors can build on.
Finally, the best unsafe code is the unsafe code you don't have to write. Unsafe code provides infinite power, but people often have simpler needs that could be made safe with enough effort. Projects like cxx demonstrate the power of this approach. For Rust the language, safe transmute is the most promising such effort, and I'd like to see more of that.
Teaching Rust in universities
More and more universities are offering classes that make use of Rust, and recently many of these educators have come together in the Rust Edu initiative to form shared teaching materials. I think this is great, and a trend we should encourage. It's helpful for the Rust community, of course, since it means more Rust programmers. I think it's also helpful for the students: much like learning a functional programming language, learning Rust requires incorporating different patterns and structure than other languages. I find my programs tend to be broken into smaller pieces, and the borrow checker forces me to be more thoughtful about which bits of context each function will need. Even if you wind up building your code in other languages, those new patterns will influence the way you work.
Stronger connections to teachers can also be a great source of data for improving Rust. If we understand better how people learn Rust and what they find difficult, we can use that to guide our priorities and look for ways to make it better. This might mean changing the language, but it might also mean changing the tooling or error messages. I'd like to see us set up some mechanism to feed insights from Rust educators, both in universities and from trainers at companies like Ferrous Systems or Integer32, into the Rust teams.
One particularly exciting effort here is the research being done at Brown University by Will Crichton and Shriram Krishnamurthi. Will and Shriram have published an interactive version of the Rust book that includes quizzes. As a reader, these quizzes help you check that you understood the section. But they also provide feedback to the book authors on which sections are effective. And they allow for "A/B testing", where you change the content of the book and see whether the quiz scores improve. Will and Shriram are also looking at other ways to deepen our understanding of how people learn Rust.
More insight and data into the user experience
As Rust has grown, we no longer have the obvious gaps in our user experience that there used to be (e.g., "no IDE support"). At the same time, it's clear that the experience of Rust developers could be a lot smoother. There are a lot of great ideas for changes to make, but it's hard to know which ones would be most effective. I would like to see a more coordinated effort to gather data on the user experience and transform it into actionable insights. Currently, the largest source of data that we have is the annual Rust survey. This is a great resource, but it only gives a very broad picture of what's going on.
A few years back, the async working group collected “status quo” stories as part of its vision doc effort. These stories were immensely helpful in understanding the “async Rust user experience”, and they are still helping to shape the priorities of the async working group today. At the same time, that was a one-time effort, and it was focused on async specifically. I think that kind of effort could be useful in a number of areas.
I’ve already mentioned that teachers can provide one source of data. Another is simply going out and having conversations with Rust users. But I think we also need fine-grained data about the user experience. In the compiler team’s mid-year report, they noted (emphasis mine):
One more thing I want to point out: five of the ambitions checked the box in the survey that said “some of our work has reached Rust programmers, but we do not know if it has improved Rust for them.”
Right now, it's really hard to know even basic things, like how many users are encountering compiler bugs in the wild. We have to judge that by how many comments people leave on a GitHub issue. Meanwhile, Esteban personally scours Twitter to find out which error messages are confusing to people. We should look into better ways to gather data here. I'm a fan of (opt-in, privacy preserving) telemetry, but I think there's a discussion to be had here about the best approach. All I know is that there has to be a better way.
Maturing our governance
In 2015, shortly after 1.0, RFC 1068 introduced the original Rust teams: libs, lang, compiler, infra, and moderation. Each team is an independent, decision-making entity, owning one particular aspect of Rust, and operating by consensus. The "Rust core team" was given the role of knitting them together and providing a unifying vision. This structure has been a great success, but as we've grown, it has started to hit some limits.
The first limiting point has been bringing the teams together. The original vision was that team leads, along with others, would be part of a core team that would provide a unifying technical vision and tend to the health of the project. It's become clear over time, though, that these are really two different jobs. Over this year, the various Rust teams, project directors, and the existing core team have come together to define a new model for project-wide governance. This effort is being driven by a dedicated working group and I am looking forward to seeing that effort come to fruition this year.
The second limiting point has been the need for more specialized teams. One example near and dear to my heart is the new types team, which is focused on the type and trait system. This team has the job of diving into the nitty gritty on proposals like Generic Associated Types or impl Trait, and then surfacing the key details for broader-based teams like lang or compiler where necessary. The aforementioned opsem team is another example of this sort of team. I suspect we'll be seeing more teams like this.
There continues to be a need for us to grow teams that do more than coding. The compiler team prioritization effort, under the leadership of apiraino, is a great example of a vital role that allows Rust to function but doesn’t involve landing PRs. I think there are a number of other “multiplier”-type efforts that we could use. One example would be “reporters”, i.e., people to help publish blog posts about the many things going on and spread information around the project. I am hopeful that as we get a new structure for top-level governance we can see some renewed focus and experimentation here.
Conclusion
Seven years since Rust 1.0 and we are still going strong. As Rust usage spreads, our focus is changing. Where once we had gaping holes to close, it's now more a question of iterating to build on our success. But the more things change, the more they stay the same. Rust is still working to empower people to build reliable, performant programs. We still believe that building a supportive, productive tool for systems programming, one that brings more people into the "systems programming" tent, is also the best way to help the existing C and C++ programmers "hack without fear" and build the kind of systems they always wanted to build. So, what are you waiting for? Let's get building!
The Rust Programming Language Blog: Officially announcing the types team
Oh hey, it's another new team announcement. But I will admit: if you follow the RFCs repository, the Rust zulip, or were particularly observant on the GATs stabilization announcement post, then this might not be a surprise for you. In fact, this "new" team was officially established at the end of May last year.
There are a few reasons why we're sharing this post now (as opposed to months before or...never). First, the team finished a three-day in-person/hybrid meetup at the beginning of December and we'd like to share the purpose and outcomes of that meeting. Second, this announcement comes after around 7 months of team activity, and we'd love to share what we've accomplished within this time. Lastly, as we enter the new year of 2023, it's a great time to share a bit of where we expect to head in this year and beyond.
Background - How did we get here?
Rust has grown significantly in the last several years, in many metrics: users, contributors, features, tooling, documentation, and more. As it has grown, the list of things people want to do with it has grown just as quickly. On top of powerful and ergonomic features, the demand for powerful tools such as IDEs or learning tools for the language has become more and more apparent. New compilers (frontend and backend) are being written. And, to top it off, we want Rust to continue to maintain one of its core design principles: safety.
All of these points highlight some key needs: to be able to know how the Rust language should work, to be able to extend the language and compiler with new features in a relatively painless way, to be able to hook into the compiler and query important information about programs, and finally to be able to maintain the language and compiler in an amenable and robust way. Over the years, considerable effort has been put into these needs, but we haven't quite achieved these key requirements.
To extend a little, and put some numbers to paper, there are currently around 220 open tracking issues for language, compiler, or types features that have been accepted but are not completely implemented, of which about half are at least 3 years old and many are several years older than that. Many of these tracking issues have been open for so long not solely because of bandwidth, but because working on these features is hard, in large part because putting the relevant semantics in context of the larger language properly is hard; it's not easy for anyone to take a look at them and know what needs to be done to finish them. It's clear that we still need better foundations for making changes to the language and compiler.
Another number that might shock you: there are currently 62 open unsoundness issues. This sounds much scarier than it really is: nearly all of these are edges of the compiler and language that have been found by people who specifically poke and prod to find them; in practice these will not pop up in the programs you write. Nevertheless, these are edges we want to iron out.
The Types Team
Moving forward, let's talk about a smaller subset of Rust rather than the entire language and compiler. Specifically, the parts relevant here include the type checker - loosely, defining the semantics and implementation of how variables are assigned their type, trait solving - deciding what traits are defined for which types, and borrow checking - proving that Rust's ownership model always holds. All of these can be thought of cohesively as the "type system".
As of RFC 3254, the above subset of the Rust language and compiler are under the purview of the types team. So, what exactly does this entail?
First, since around 2018, there existed the "traits working group", which had the primary goal of creating a performant and extensible definition and implementation of Rust's trait system (including the Chalk trait-solving library). As time progressed, and particularly in the latter half of 2021 into 2022, the working group's influence and responsibility naturally expanded to the type checker and borrow checker too - they are strongly linked and it's often hard to disentangle the trait solver from the other two. So, in some ways, the types team essentially subsumes the former traits working group.
Another relevant working group is the polonius working group, which primarily works on the design and implementation of the Polonius borrow-checking library. While the working group itself will remain, it is now also under the purview of the types team.
Now, although the traits working group was essentially folded into the types team, the creation of a team has some benefits. First, like the style team (and many other teams), the types team is not a top level team. It actually, currently uniquely, has two parent teams: the lang and compiler teams. Both teams have decided to delegate decision-making authority covering the type system.
The language team has delegated the design of the type system. Importantly, however, this covers less of the "feel" of the type system's features and more of how it "works", with the expectation that the types team will advise on and bring concerns about new language extensions where required. (This division is not strongly defined, but the expectation is generally to err on the side of more caution). The compiler team, on the other hand, has delegated the responsibility of defining and maintaining the implementation of the trait system.
One particular responsibility that has traditionally been shared between the language and compiler teams is the assessment and fixing of soundness bugs in the language related to the type system. These often arise from implementation-defined language semantics and have in the past required synchronization and input from both lang and compiler teams. In the majority of cases, the types team now has the authority to assess and implement fixes without the direct input from either parent team. This applies, importantly, for fixes that are technically backwards-incompatible. While fixing safety holes is not covered under Rust's backwards compatibility guarantees, these decisions are not taken lightly and generally require team signoff and are assessed for potential ecosystem breakage with crater. However, this can now be done under one team rather than requiring the coordination of two separate teams, which makes closing these soundness holes easier (I will discuss this more later.)
Formalizing the Rust type system
As mentioned above, a nearly essential element of the growing Rust language is to know how it should work (and to have this well documented). There are relatively recent efforts pushing for a Rust specification (like Ferrocene or this open RFC), but it would be hugely beneficial to have a formalized definition of the type system, regardless of its potential integration into a more general specification. In fact the existence of a formalization would allow a better assessment of potential new features or soundness holes, without the subtle intricacies of the rest of the compiler.
As far back as 2015, not long after the release of Rust 1.0, an experimental Rust trait solver called Chalk began to be written. The core idea of Chalk is to translate the surface syntax and ideas of the Rust trait system (e.g. traits, impls, where clauses) into a set of logic rules that can be solved using a Prolog-like solver. Then, once this set of logic and solving reaches parity with the trait solver within the compiler itself, the plan was to simply replace the existing solver. In the meantime (and continuing forward), this new solver could be used by other tools, such as rust-analyzer, where it is used today.
Now, given Chalk's age and the promises it had been hoped to be able to deliver on, you might be tempted to ask the question "Chalk, when?" - and plenty have. However, we've learned over the years that Chalk is likely not the correct long-term solution for Rust, for a few reasons. First, as mentioned a few times in this post, the trait solver is only but a part of a larger type system; and modeling how the entire type system fits together gives a more complete picture of its details than trying to model the parts separately. Second, the needs of the compiler are quite different than the needs of a formalization: the compiler needs performant code with the ability to track information required for powerful diagnostics; a good formalization is one that is not only complete, but also easy to maintain, read, and understand. Over the years, Chalk has tried to have both and it has so far ended up with neither.
So, what are the plans going forward? Well, first the types team has begun working on a formalization of the Rust type system, currently coined a-mir-formality. An initial experimental phase was written using PLT Redex, but a Rust port is in progress. There's lots to do still (including modeling more of the trait system, writing an RFC, and moving it into the rust-lang org), but it's already showing great promise.
Second, we've begun an initiative for writing a new trait solver in-tree. This new trait solver is more limited in scope than a-mir-formality (i.e. not intending to encompass the entire type system). In many ways, it's expected to be quite similar to Chalk, but leverage bits and pieces of the existing compiler and trait solver in order to make the transition as painless as possible. We do expect it to be pulled out-of-tree at some point, so it's being written to be as modular as possible. During our types team meetup earlier this month, we were able to hash out what we expect the structure of the solver to look like, and we've already gotten that merged into the source tree.
Finally, Chalk is no longer going to be a focus of the team. In the short term, it still may remain a useful tool for experimentation. As said before, rust-analyzer uses Chalk as its trait solver. It's also able to be used in rustc under an unstable feature flag. Thus, new ideas currently could be implemented in Chalk and battle-tested in practice. However, this benefit will likely not last long as a-mir-formality and the new in-tree trait solver get more usable and their interfaces become more accessible. All this is not to say that Chalk has been a failure. In fact, Chalk has taught us a lot about how to think about the Rust trait solver in a logical way and the current Rust trait solver has evolved over time to more closely model Chalk, even if incompletely. We expect to still support Chalk in some capacity for the time being, for rust-analyzer and potentially for those interested in experimenting with it.
Closing soundness holes
As brought up previously, a big benefit of creating a new types team with delegated authority from both the lang and compiler teams is the authority to assess and fix unsoundness issues mostly independently. However, a secondary benefit has actually just been better procedures and knowledge-sharing that allows the members of the team to get on the same page for what soundness issues there are, why they exist, and what it takes to fix them. For example, during our meetup earlier this month, we were able to go through the full list of soundness issues (focusing on those relevant to the type system), identify their causes, and discuss expected fixes (though most require prerequisite work discussed in the previous section).
Additionally, the team has already made a number of soundness fixes and has a few more in progress. I won't go into details, but instead am just opting to put them in list form:
- Consider unnormalized types for implied bounds: landed in 1.65, no regressions found
- Neither require nor imply lifetime bounds on opaque type for well formedness: landed in 1.66, no regressions found
- Add IMPLIED_BOUNDS_ENTAILMENT lint: landing in 1.68, future-compat lint because many regressions found (of unsoundness)
- Check ADT fields for copy implementations considering regions: currently open, ready to land
- Register wf obligation before normalizing in wfcheck: currently open, regressions found, needs additional work
- Handle projections as uncovered types during coherence check: currently open, some regressions found, future-compat lint suggested
- Don't normalize in AstConv: landing in 1.68, 1 small regression found
As you can see, we're making progress on closing soundness holes. These sometimes break code, as assessed by crater. However, we do what we can to mitigate this, even when the code being broken is technically unsound.
New features
While it's not technically under the types team purview to propose and design new features (these fall more under lang team proper), there are a few instances where the team is heavily involved (if not driving) feature design.
These can be small additions, which are close to bug fixes. For example, this PR allows more permutations of lifetime outlives bounds than what compiled previously. Or, these PRs can be larger, more impactful changes, that don't fit under a "feature", but instead are tied heavily to the type system. For example, this PR makes the Sized trait coinductive, which effectively makes more cyclic bounds compile (see this test for an example).
There are also a few larger features and feature sets that have been driven by the types team, largely due to the heavy intersection with the type system. Here are a few examples:
- Generic associated types (GATs) - The feature long predates the types team and is the only one in this list that has actually been stabilized so far. But due to heavy type system interaction, the team was able to navigate the issues that came on its final path to stabilization. See this blog post for much more details.
- Type alias impl trait (TAITs) - Implementing this feature properly requires a thorough understanding of the type checker. This is close to stabilization. For more information, see the tracking issue.
- Trait upcasting - This one is relatively small, but has some type system interaction. Again, see the tracking issue for an explanation of the feature.
- Negative impls - This too predates the types team, but has recently been worked on by the team. There are still open bugs and soundness issues, so this is a bit away from stabilization, but you can follow here.
- Return position impl traits in traits (RPITITs) and async functions in traits (AFITs) - These have only recently been possible with advances made with GATs and TAITs. They are currently tracked under a single tracking issue.
To conclude, let's put all of this onto a roadmap. As always, goals are best when they are specific, measurable, and time-bound. For this, we've decided to split our goals into roughly 4 stages: summer of 2023, end-of-year 2023, end-of-year 2024, and end-of-year 2027 (6 months, 1 year, 2 years, and 5 years). Overall, our goals are to build a platform to maintain a sound, testable, and documented type system that can scale to new features needed by the Rust language. Furthermore, we want to cultivate a sustainable and open-source team (the types team) to maintain that platform and type system.
A quick note: some of the things here have not quite been explained in this post, but they've been included in the spirit of completeness. So, without further ado:
6 months
- The work-in-progress new trait solver should be testable
- a-mir-formality should be testable against the Rust test suite
- Both TAITs and RPITITs/AFITs should be stabilized or on the path to stabilization.
EOY 2023
- New trait solver replaces part of existing trait solver, but not used everywhere
- We have an onboarding plan (for the team) and documentation for the new trait solver
- a-mir-formality is integrated into the language design process
EOY 2024
- New trait solver shared by rustc and rust-analyzer
- Milestone: Type IR shared
- We have a clean API for extensible trait errors that is available at least internally
- "Shiny features"
- Polonius in a usable state
- Implied bounds in higher-ranked trait bounds (see this issue for an example of an issue this would fix)
- Being able to use impl Trait basically anywhere
- Potential edition boundary changes
EOY 2027
- (Types) unsound issues resolved
- Most language extensions are easy to do; large extensions are feasible
- a-mir-formality passes 99.9% of the Rust test suite
It's an exciting time for Rust. As its userbase and popularity grow, the language does as well. And as the language grows, the need for a sustainable type system to support the language becomes ever more apparent. The project has formed this new types team to address this need, and hopefully this post shows that the team has already accomplished a lot. We expect that trend to only continue in the years to come.
As always, if you'd like to get involved or have questions, please drop by the Rust zulip.
Will Kahn-Greene: Socorro: Schema based overhaul of crash ingestion: retrospective (2022)
- time: 2+ years
- impact:
  - radically reduced risk of data leaks due to misconfigured permissions
  - centralized and simplified configuration and management of fields
  - normalization and validation performed during processing
  - documentation of data reviews, data caveats, etc.
  - reduced risk of bugs when adding new fields--testing is done in CI
  - new crash reporting data dictionary with Markdown-formatted descriptions, real examples, and relevant links
I've been working on Socorro (the crash ingestion pipeline at Mozilla) since the beginning of 2016. During that time, I've focused on streamlining maintenance of the project, paying down technical debt, reducing risk, and improving crash analysis tooling.
One of the things I identified early on is that the crash ingestion pipeline was chaotic, difficult to reason about, and difficult to document. What did the incoming data look like? What did the processed data look like? Was it valid? Which fields were protected? Which fields were public? How do we add support for a new crash annotation? This was problematic for our ops staff, engineering staff, and all the people who used Socorro. It was something in the back of my mind for a while, but I didn't have any good ideas for fixing it.
In 2020, Socorro moved into the Data Org, which has multiple data pipelines. After spending some time looking at how their pipelines work, I wanted to rework crash ingestion.
The end result of this project is that:
- the project is easier to maintain:
  - adding support for new crash annotations is done in a couple of schema files and possibly a processor rule
- risk of security issues and data breaches is lower:
  - typos, bugs, and mistakes when adding support for a new crash annotation are caught in CI
  - permissions are specified in a central location; changing permissions for a field is trivial and takes effect in the next deploy; setting permissions supports complex data structures in easy-to-reason-about ways; and mistakes are caught in CI
- the data is easier to use and reason about:
  - normalization and validation of crash annotation data happens during processing, so downstream uses of the data can expect it to be valid; further, we get a signal when the data isn't valid, which can indicate product bugs
  - schemas describe the incoming and processed data
  - a crash reporting data dictionary documents incoming data fields, processed data fields, descriptions, sources, data gotchas, examples, and permissions
Socorro is the crash ingestion pipeline for Mozilla products like Firefox, Fenix, Thunderbird, and MozillaVPN.
When Firefox crashes, the crash reporter asks the user if the user would like to send a crash report. If the user answers "yes!", then the crash reporter collects data related to the crash, generates a crash report, and submits that crash report as an HTTP POST to Socorro. Socorro saves the submitted crash report, processes it, and has tools for viewing and analyzing crash data.
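To make that submission concrete, here is a minimal sketch of what such a crash report POST can look like, assuming a hypothetical collector URL and a small, illustrative set of annotations; the real annotation names and endpoint are defined by the crash reporter and the collector.

```python
# Minimal sketch of a crash report submission: annotations go in as
# multipart form fields and the minidump is attached as a file.
# The URL and annotation values here are illustrative only.
import requests

annotations = {
    "ProductName": "Firefox",
    "Version": "109.0",
    "ReleaseChannel": "release",
}

with open("crash.dmp", "rb") as dump:
    response = requests.post(
        "https://crash-reports.example.com/submit",  # hypothetical collector URL
        data=annotations,
        files={"upload_file_minidump": dump},
        timeout=30,
    )

# The collector replies with a crash ID that can be used later to find
# the processed crash.
print(response.text)
```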
State of crash ingestion at the beginning
The crash ingestion system was working and it was usable, but it was in a bad state.
Poor data management
Normalization and validation of data was all over the codebase and not consistent:
processor rule code
AWS S3 crash storage code
Elasticsearch indexing code
Telemetry crash storage code
Super Search querying and result rendering code
report view and template code
signature report code and template code
crontabber job code
any scripts that used the data
tests -- many of which had bad test data so who knows what they were really testing
Naive handling of minidump stackwalker output meant that changes in the stackwalker output mostly went unnoticed, and there was no indication of whether changed output created issues in the system.
Further, since it was all over the place, there were no guarantees for data validity when downloading it using the RawCrash, ProcessedCrash, and SuperSearch APIs. Anyone writing downstream systems would also have to normalize and validate the data.
Poor permissions management
Permissions were defined in multiple places:
Elasticsearch json redactor
Super Search fields
RawCrash API allow list
ProcessedCrash API allow list
report view and template code
Telemetry crash storage code
and other places
We couldn't effectively manage permissions of fields in the stackwalker output because we had no idea what was there.
Poor documentation
No documentation of crash annotation fields other than CrashAnnotations.yaml which didn't enforce anything in crash ingestion (process, valid type, data correctness, etc) and was missing important information like data gotchas, data review urls, and examples.
No documentation of processed crash fields at all.
Making changes was high risk
Changing fields from public to protected was high risk because you had to find all the places a field might show up, which was intractable. Adding support for new fields often took multiple passes over several weeks because we'd miss things. Server errors happened with some regularity due to weirdness in crash annotation values affecting the Crash Stats site.
Tangled concerns across the codebase
Lots of tangled concerns where things defined in one place affected other places that shouldn't be related. For example, the Super Search fields definition was acting as a "schema" for other parts of the system that had nothing to do with Elasticsearch or Super Search.
Difficult to maintain
It was difficult to support new products.
It was difficult to debug issues in crash ingestion and crash reporting.
The Crash Stats webapp contained lots of if/then/else bits to handle weirdness in the crash annotation values. Nulls, incorrect types, different structures, etc.
Socorro contained lots of vestigial code from half-done field removal, deprecated fields, fields that were removed from crash reports, etc. These vestigial bits were all over the code base. Discovering and removing these bits was time consuming and error prone.
The code for exporting data to Telemetry built the export data using a list of fields to exclude rather than a list of fields to include. This is backwards and impossible to maintain--we never should have been doing it. Further, it pulled data from the raw crash, which we had no validation guarantees for, and that would cause issues downstream in the Telemetry import code.
There was no way to validate the data used in the unit tests, so a lot of it was invalid. That meant CI would pass, but we'd see errors in our stage and production environments.
Different from other similar systems
In 2020, Socorro was moved to the Data Org in Mozilla, which had a set of standards and conventions for collecting, storing, analyzing, and providing access to data. Socorro didn't follow any of it, which made it difficult to work on, to connect with, and to staff. Things Data Org has that Socorro didn't:
a schema specifying fields, types, and documentation
data flow documentation
data review policy, process, and artifacts for data being collected and how to add new data
data dictionary for fields for users including documentation, data review urls, data gotchas
In summary, we had a system that took a lot of effort to maintain, wasn't serving our users' needs, and carried a high risk of a security or data breach.
Project plan
Many of these issues can be alleviated and reduced by moving to a schema-driven system where we:
define a schema for annotations and a schema for the processed crash
change crash ingestion and the Crash Stats site to use those schemas
When designing this schema-driven system, we should be thinking about:
how easy is it to maintain the system?
how easy is it to explain?
how flexible is it for solving other kinds of problems in the future?
what kinds of errors will likely happen when maintaining the system and how can we avert them in CI?
what kinds of errors can happen and how much risk do they pose for data leaks? what of those can we avert in CI?
how flexible is the system which needs to support multiple products potentially with different needs?
I worked out a minimal version of that vision that we could migrate to and then work with going forward.
The crash annotations schema should define:
what annotations are in the crash report?
which permissions are required to view a field
field documentation (provenance, description, data review, related bugs, gotchas, analysis tips, etc)
The processed crash schema should define:
what's in the processed crash?
which permissions are required to view a field
field documentation (provenance, description, related bugs, gotchas, analysis tips, etc)
Then we make the following changes to the system:
write a processor rule to copy, normalize, and validate data from the raw crash based on the processed crash schema (a sketch of such a rule follows this list)
switch the Telemetry export code to using the processed crash for data to export
switch the Telemetry export code to using the processed crash schema for permissions
switch Super Search to using the processed crash for data to index
switch Super Search to using the processed crash schema for documentation and permissions
switch Crash Stats site to using the processed crash for data to render
switch Crash Stats site to using the processed crash schema for documentation and permissions
switch the RawCrash, ProcessedCrash, and SuperSearch APIs to using the crash annotations and processed crash schemas for documentation and permissions
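As referenced in the first item above, here is a hedged sketch of what a schema-driven processor rule could look like. The class name, the rule interface, and the source_annotation schema key are hypothetical rather than Socorro's actual API; the point is that copying, normalization, and validation are all driven by the processed crash schema.

```python
# Hedged sketch of a schema-driven processor rule. Names and the rule
# interface are hypothetical; Socorro's real implementation may differ.
from jsonschema import Draft7Validator


class CopyFromRawCrashRule:
    """Copy fields from the raw crash into the processed crash,
    normalizing and validating them against the processed crash schema."""

    def __init__(self, processed_crash_schema):
        self.schema = processed_crash_schema
        self.validator = Draft7Validator(processed_crash_schema)

    def action(self, raw_crash, processed_crash, notes):
        for field, spec in self.schema.get("properties", {}).items():
            source = spec.get("source_annotation")  # hypothetical schema key
            if source is None or source not in raw_crash:
                continue
            value = raw_crash[source]
            # Normalize based on the declared type.
            if spec.get("type") == "integer":
                try:
                    value = int(value)
                except (TypeError, ValueError):
                    notes.append(f"{source}: not a valid integer")
                    continue
            processed_crash[field] = value

        # Validation problems become processor notes rather than hard failures.
        for error in self.validator.iter_errors(processed_crash):
            notes.append(f"schema validation: {error.message}")
```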
After doing that, we have:
field documentation is managed in the schemas
permissions are managed in the schemas
data is normalized and validated once in the processor and everything uses the processed crash data for indexing, searching, and rendering
adding support for new fields and changing existing fields is easier and problems are caught in CI
Use JSON Schema.
Data Org at Mozilla uses JSON Schema for schema specification. The schema is written using YAML.
https://mozilla.github.io/glean_parser/metrics-yaml.html
The metrics schema is used to define metrics.yaml files which specify the metrics being emitted and collected.
For example:
https://searchfox.org/mozilla-central/source/toolkit/mozapps/update/metrics.yaml
One long long long term goal for Socorro is to unify standards and practices with the Data Ingestion system. Toward that goal, it's prudent to build out the crash annotation and processed crash schemas using whatever we can take from the equivalent metrics schemas.
We'll additionally need to build out tooling for verifying, validating, and testing schema modifications to make ongoing maintenance easier.
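As a sketch of that tooling, a CI check might simply load each schema file and verify that it is itself valid JSON Schema before any change can land. The file names below are illustrative.

```python
# Illustrative CI check: fail the build if a schema file is not valid
# JSON Schema. File names are placeholders.
import yaml
from jsonschema import Draft7Validator, SchemaError


def check_schema_file(path):
    with open(path) as f:
        schema = yaml.safe_load(f)
    try:
        Draft7Validator.check_schema(schema)
    except SchemaError as exc:
        raise SystemExit(f"{path} is not a valid JSON Schema: {exc.message}")


if __name__ == "__main__":
    for path in ("raw_crash_annotations.yaml", "processed_crash.yaml"):
        check_schema_file(path)
    print("schemas OK")
```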
Use schemas to define and drive everything.
We've got permissions, structures, normalization, validation, definition, documentation, and several other things related to the data and how it's used throughout crash ingestion spread out across the codebase.
Instead of that, let's pull it all together into a single schema and change the system to be driven from this schema.
The schema will include:
structure specification
documentation including data gotchas, examples, and implementation details
permissions
processing instructions
We'll have a schema for supported annotations and a schema for the processed crash.
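To make those ingredients concrete, here is a purely illustrative example of what a single processed crash field's entry might carry, written as a Python dict for readability; the key names are hypothetical and are not the ones the real Socorro schemas use.

```python
# Purely illustrative field entry: structure, documentation, permissions,
# and a processing instruction in one place. Keys are hypothetical.
example_field = {
    "available_memory_mb": {
        "type": "integer",
        "description": "Amount of memory available to the process, in MB.",
        "source_annotation": "AvailableMemoryMB",  # processing instruction
        "data_review": "https://example.com/data-review/123",  # placeholder URL
        "gotchas": "Only present on some platforms.",
        "permissions": ["public"],
    }
}
```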
We'll rewrite existing parts of crash ingestion to use the schema:
- processing:
  1. use processing instructions to validate and normalize annotation data
- super search:
  1. field documentation
  2. permissions
  3. remove all the normalization and validation code from indexing
- crash stats:
  1. field documentation
  2. permissions
  3. remove all the normalization and validation code from page rendering
Only use processed crash data for indexing and analysis.
The indexing system has its own normalization and validation code since it pulls data to be indexed from the raw crash.
The crash stats code has its own normalization and validation code since it renders data from the raw crash in various parts of the site.
We're going to change this so that all normalization and validation happens during processing, the results are stored in the processed crash, and indexing, searching, and crash analysis only work on processed crash data.
By default, all data is protected.
By default, all data is protected unless it is explicitly marked as public. This has some consequences for the code:
any data not specified in a schema is treated as protected
all schema fields need to specify permissions for that field
any data in a schema either is marked public or lists the permissions required to view that data
for nested structures, any child field that is public has public ancestors
We can catch some of these issues in CI and need to write tests to verify them.
This is slightly awkward when maintaining the schema because it would be more reasonable to have "no permissions required" mean that the field is public. However, it's possible to accidentally not specify the permissions and we don't want to be in that situation. Thus, we decided to go with explicitly marking public fields as public.
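A hedged sketch of how CI could enforce those rules: walk the schema, require a permissions entry on every field, and flag any public field nested under a protected parent. The schema key names and file name are illustrative.

```python
# Illustrative CI test for the permission rules above. Schema key names
# and the file name are placeholders, not Socorro's actual ones.
import yaml


def check_permissions(schema, path="", ancestors_public=True):
    errors = []
    for name, spec in schema.get("properties", {}).items():
        field_path = f"{path}.{name}" if path else name
        permissions = spec.get("permissions")
        if permissions is None:
            errors.append(f"{field_path}: no permissions specified")
            continue
        is_public = permissions == ["public"]
        if is_public and not ancestors_public:
            errors.append(f"{field_path}: public field under a protected parent")
        # Recurse into nested structures.
        errors.extend(
            check_permissions(spec, field_path, ancestors_public and is_public)
        )
    return errors


def test_schema_permissions():
    with open("processed_crash.yaml") as f:  # placeholder file name
        schema = yaml.safe_load(f)
    assert check_permissions(schema) == []
```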
Work done
Phase 1: cleaning up
We had a lot of work to do before we could start defining schemas and changing the system to use those schemas.
remove vestigial code (some of this work was done in other phases as it was discovered)
[bug 1724933]: remove unused/obsolete annotations (2021-08)
[bug 1743487]: remove total_frames (2021-11)
[bug 1743704]: remove jit crash classifier (2022-02)
[bug 1762000]: remove vestigial Winsock_LSP code (2022-03)
[bug 1784485]: remove vestigial exploitability code (2022-08)
[bug 1784095]: remove vestigial contains_memory_report code (2022-08)
[bug 1787933]: exorcise flash things from the codebase (2022-09)
fix signature generation
[bug 1753521]: use fields from processed crash (2022-02)
[bug 1755523]: fix signature generation so it only uses processed crash data (2022-02)
[bug 1762207]: remove hang_type (2022-04)
fix Super Search
[bug 1624345]: stop saving random data to Elasticsearch crashstorage (2020-06)
[bug 1706076]: remove dead Super Search fields (2021-04)
[bug 1712055]: remove system_error from Super Search fields (2021-07)
[bug 1712085]: remove obsolete Super Search fields (2021-08)
[bug 1697051]: add crash_report_keys field (2021-11)
[bug 1736928]: remove largest_free_vm_block and tiny_block_size (2021-11)
[bug 1754874]: remove unused annotations from Super Search (2022-02)
[bug 1753521]: stop indexing items from raw crash (2022-02)
[bug 1762005]: migrate to lower-cased versions of Plugin* fields in processed crash (2022-03)
[bug 1755528]: fix flag/boolean handling (2022-03)
[bug 1762207]: remove hang_type (2022-04)
[bug 1763264]: clean up super search fields from migration (2022-07)
fix data flow and usage
[bug 1740397]: rewrite CrashingThreadInfoRule to normalize crashing thread (2021-11)
[bug 1755095]: fix TelemetryBotoS3CrashStorage so it doesn't use Super Search fields (2022-03)
[bug 1740397]: change webapp to pull crashing_thread from processed crash (2022-07)
[bug 1710725]: stop using DotDict for raw and processed data (2022-09)
clean up the raw crash structure
[bug 1687987]: restructure raw crash (2021-01 through 2022-10)
After cleaning up the code base, removing vestigial code, fixing Super Search, and fixing Telemetry export code, we could move on to defining schemas and writing all the code we needed to maintain the schemas and work with them.
[bug 1762271]: rewrite json schema reducer (2022-03)
[bug 1764395]: schema for processed crash, reducers, traversers (2022-08)
[bug 1788533]: fix validate_processed_crash to handle pattern_properties (2022-08)
[bug 1626698]: schema for crash annotations in crash reports (2022-11)
That allowed us to fix a bunch of things:
[bug 1784927]: remove elasticsearch redactor code (2022-08)
[bug 1746630]: support new threads.N.frames.N.unloaded_modules minidump-stackwalk fields (2022-08)
[bug 1697001]: get rid of UnredactedCrash API and model (2022-08)
[bug 1100352]: remove hard-coded allow lists from RawCrash (2022-08)
[bug 1787929]: rewrite Breadcrumbs validation (2022-09)
[bug 1787931]: fix Super Search fields to pull permissions from processed crash schema (2022-09)
[bug 1787937]: fix Super Search fields to pull documentation from processed crash schema (2022-09)
[bug 1787931]: use processed crash schema permissions for super search (2022-09)
[bug 1100352]: remove hard-coded allow lists from ProcessedCrash models (2022-11)
[bug 1792255]: add telemetry_environment to processed crash (2022-11)
[bug 1784558]: add collector metadata to processed crash (2022-11)
[bug 1787932]: add data review urls for crash annotations that have data reviews (2022-11)
With fields specified in schemas, we can write a crash reporting data dictionary:
[bug 1803558]: crash reporting data dictionary (2023-01)
[bug 1795700]: document raw and processed schemas and how to maintain them (2023-01)
Then we can finish:
[bug 1677143]: documenting analysis gotchas (ongoing)
[bug 1755525]: fixing the report view to only use the processed crash (future)
[bug 1795699]: validate test data (future)
This was a very, very long-term project with many small steps and some really big ones. Tackling a large project all at once is futile; the only way to do it successfully is to break it into a million small steps, each of which stands on its own and doesn't create urgency for getting the next step done.
Any time I changed field names or types, I'd have to do a data migration. Data migrations take 6 months to do because I have to wait for existing data to expire from storage. On the one hand, it's a blessing I could do migrations at all--you can't do this with larger data sets or with data sets where the data doesn't expire without each migration becoming a huge project. On the other hand, it's hard to juggle being in the middle of multiple migrations and sometimes the contortions one has to perform are grueling.
If you're working on a big project that's going to require changing data structures, figure out how to do migrations early with as little work as possible and use that process as often as you can.
Conclusion and where we could go from here
This was a huge project that spanned years. It's so hard to finish projects like this because the landscape for the project is constantly changing. Meanwhile, being mid-project has its own set of complexities and hardships.
I'm glad I tackled it and I'm glad it's mostly done. There are some minor things to do, still, but this new schema-driven system has a lot going for it. Adding support for new crash annotations is much easier, less risky, and takes less time.
It took me about a month to pull this post together.
That's it!
That's the story of the schema-based overhaul of crash ingestion. There are probably some bits missing and/or wrong, but the gist of it is here.
If you have any questions or bump into bugs, I hang out on #crashreporting on chat.mozilla.org. You can also write up a bug for Socorro.
Hopefully this helps. If not, let us know!
Mozilla Thunderbird: Important: Thunderbird 102.7.0 And Microsoft 365 Enterprise Users
Update on January 31st:
We’re preparing to ship a 2nd build of Thunderbird 102.7.1 with an improved patch for the Microsoft 365 oAuth issue reported here. Our anticipated release window is before midnight Pacific Time, January 31.
Update on January 28th:
Some users still experienced issues with the solution to the authentication issue that was included in Thunderbird 102.7.1. A revised solution has been proposed and is expected to ship soon. We apologize for the inconvenience this has caused, and the disruption to your workflow. You can track this issue via Bug #1810760.
Update on January 20th:
Thunderbird 102.7.0 was scheduled to be released on Wednesday, January 18, but we decided to hold the release because of an issue detected which affects authentication of Microsoft 365 Business accounts.
A solution to the authentication issue will ship with version 102.7.1, releasing during the week of January 23. Version 102.7.0 is now available for manual download only, to allow unaffected users to choose to update and benefit from the fixes it delivers.
Please note that automatic updates are currently disabled, and users of Microsoft 365 Business are cautioned to not update.
*Users who update and encounter difficulty can simply reinstall 102.6.1. Thunderbird should automatically detect your existing profile. However, you can launch the Profile Manager if needed by following these instructions.
On Wednesday, January 18, Thunderbird 102.7.0 will be released with a crucial change to how we handle OAuth2 authorization with Microsoft accounts. This may involve some extra work for users currently using Microsoft-hosted accounts through their employer or educational institution.
In order to meet Microsoft’s requirements for publisher verification, it was necessary for us to switch to a new Azure application and application ID. However, some of these accounts are configured to require administrators to approve any applications accessing email.
If you encounter a screen saying “Need admin approval” during the login process, please contact your IT administrators to approve the client ID 9e5f94bc-e8a4-4e73-b8be-63364c29d753 for Mozilla Thunderbird (it previously appeared to non-admins as “Mzla Technologies Corporation”).
We request the following permissions:
- IMAP.AccessAsUser.All (Read and write access to mailboxes via IMAP.)
- POP.AccessAsUser.All (Read and write access to mailboxes via POP.)
- SMTP.Send (Send emails from mailboxes using SMTP AUTH.)
- offline_access
(Please note that this change was previously implemented in Thunderbird Beta, but the Thunderbird 102.7.0 release introduces this change to our stable ESR release.)
The post Important: Thunderbird 102.7.0 And Microsoft 365 Enterprise Users appeared first on The Thunderbird Blog.
Karl Dubost: Quirks, Site Interventions And Fixing Websites
Jelmer recently asked: "What is Site Specific Hacks?" in the context of the Web Inspector.
Safari Technology Preview 161 shows a new button to activate or deactivate Site Specific Hacks. But what are these?
Site Specific Hacks are pieces of WebKit code (called Quirks internally) that change the behavior of the browser in order to repair, for the user, a broken behavior on a website.
When a site has a broken behavior and is then not usable by someone in a browser, there are a couple of choices:
- If the broken behavior is widespread across the Web, and some browsers work with it, the standard specification and the implementations need to be changed.
- If the broken behavior is local to one or a small number of websites, there are two non-exclusive options
- Outreach
- Quirk (WebKit), Site Intervention (Gecko), Patch (Presto)
Outreach improves the Web, but it is costly in time and effort. Often it's very hard to reach the right person, and it doesn't necessarily lead to the expected result. Websites also have their own business priorities.
A site specific hack, or a quirk in WebKit lingo, is a fix that helps the browser cope with website code that fails in a specific context. Quirks are definitely a bandaid and not a development strategy. They exist to let someone using a browser have a good and fluid experience. Ideally, outreach would happen in parallel so we could remove the site specific hack after a while.
A Recent Example: FlightAware Webapp
I recently removed a quirk in WebKit, which was put in place in the past to solve an issue.
The bug manifested in WebViews in iOS applications on devices where window.devicePixelRatio is 3, with this function:

if (
  (i && i.canvas.style.transform === e
    ? ((this.container = t), (this.context = i), (this.containerReused = !0))
    : this.containerReused &&
      ((this.container = null), (this.context = null), (this.containerReused = !1)),
  !this.container)
) {
  (n = document.createElement("div")).className = o;
  var a = n.style;
  (a.position = "absolute"), (a.width = "100%"), (a.height = "100%");
  var s = (i = li()).canvas;
  n.appendChild(s),
    ((a = s.style).position = "absolute"),
    (a.left = "0"),
    (a.transformOrigin = "top left"),
    (this.container = n),
    (this.context = i);
}

which compares two strings for equality: i.canvas.style.transform, with the value

matrix(0.333333, 0, 0, 0.333333, 0, 0)

and e, with the value

matrix(0.3333333333333333, 0, 0, 0.3333333333333333, 0, 0)

The matrix string was computed by:

function Je(t) { return "matrix(" + t.join(", ") + ")" }

So these two strings are clearly different, and the code above never executed.
In the CSSOM specification for serialization, it is mentioned:
<number>
A base-ten number using digits 0-9 (U+0030 to U+0039) in the shortest form possible, using "." to separate decimals (if any), rounding the value if necessary to not produce more than 6 decimals, preceded by "-" (U+002D) if it is negative.
It was not always like this. The specification was changed at some point, the implementations changed, and the issue surfaced once WebKit became compliant with the specification.
The old code was like this:
e.prototype.renderFrame = function (t, e) {
  var r = t.pixelRatio,
    n = t.layerStatesArray[t.layerIndex];
  !(function (t, e, r) {
    We(t, e, 0, 0, r, 0, 0);
  })(this.pixelTransform, 1 / r, 1 / r),
    qe(this.inversePixelTransform, this.pixelTransform);
  var i = Je(this.pixelTransform);
  this.useContainer(e, i, n.opacity);
  // cut for brevity
}

Specifically, this line could be fixed like this:

})(this.pixelTransform, (1/r).toFixed(6), (1/r).toFixed(6) ),

That would probably help a lot. Note that Firefox, and probably Chrome, may have had the same issue on devices where window.devicePixelRatio is 3.
Outreach worked and they changed the code, but in the meantime the quirk was there to help people have a good user experience.
Why Deactivating A Quirk In The Web Inspector?
Why does the Web Inspector give you the ability to deactivate site specific hacks, aka Quirks?
- Web developers for the impacted websites need to know whether their fix solves the issue, so they need a way to see how the browser would behave without the quirk.
- Browser implementers and QA need to know whether a quirk is still needed for a specific website. Deactivating a quirk gives a quick way to test whether it can be removed.
The main list of Quirks is visible in the source code of WebKit. If you are part of a site for which WebKit had to create a quirk, do not hesitate to contact me on GitHub, by mail, or on Mastodon, and we can find a solution together to remove the Quirk in question.
Otsukare!
The Servo Blog: Servo to Advance in 2023
We would like to share some exciting news about the Servo project. This year, thanks to new external funding, a team of developers will be actively working on Servo. The first task is to reactivate the project and the community around it, so we can attract new collaborators and sponsors for the project.
The focus for 2023 is to improve the situation of the layout system in Servo, with the initial goal of getting basic CSS2 layout working. Given the renewed activity in the project, we will keep you posted with more updates throughout the year. Stay tuned!
About Servo
Created by Mozilla Research in 2012, the Servo project is a Research & Development effort meant to create an independent, modular, embeddable web engine that allows developers to deliver content and applications using web standards. Servo is an experimental browser engine written in Rust, taking advantage of the memory safety properties and concurrency features of the language. Stewardship of Servo moved from Mozilla Research to the Linux Foundation in 2020, where its mission remains unchanged.
Frederik Braun: Origins, Sites and other Terminologies
In order to fully discuss security issues, their common root causes and useful prevention or mitigation techniques, you will need some common ground on the security model of the web. This, in turn, relies on various terms and techniques that will be presented in the next sections.
Feel free to …
Support.Mozilla.Org: Introducing Erik Avila
Hey folks,
I’m delighted to introduce you to Erik Avila who is joining our team as an additional Community Support Advocate. Here’s a short intro from Erik:
Hi! I’m Erik. I’ll be helping the mobile support team to moderate and send responses to app reviews, also, I’ll help identify trends to track them. I’m very excited to help and work with you all.
Erik will be helping out with the Mobile Store Support initiative, alongside Dayana. We also introduced him in the community call last week.
Please join me in congratulating and welcoming Erik!
Patrick Cloke: Researching for a Matrix Spec Change
The Matrix protocol is modified via Matrix Spec Changes (frequently abbreviated to “MSCs”). These are short documents describing any technical changes and why they are worth making (see an example). I’ve written a bunch and wanted to document my research process. [1]
Note
I treat my research as a living document, not an artifact. Thus, I don’t worry much about the format. The important part is to start writing everything down to have a single source of truth that can be shared with others.
First, I write out a problem statement: what am I trying to solve? (This step might seem obvious, but it is useful to frame the technical changes in why they matter. Many proposals seem to skip this step.) Most of my work tends to be from the point of view of an end-user, but some changes are purely technical. Regardless, there is benefit from a shared written context of the issue to be solved.
From the above and prior knowledge, I list any open questions (which I update through this process). I’ll augment the questions with answers I find in my research, write new ones about things I don’t understand, or remove them as they become irrelevant.
Next, I begin collecting any previous work done in this area, including:
What is the current specification related to this? I usually pull out blurbs (with links back to the source) from the latest specification.
Are there any related known issues? It is also worth checking the issue trackers of projects: I start with the Synapse, Element Meta, and Element Web repositories.
Are there related outstanding MSCs or previous research? I search the matrix-spec-proposals repository for keywords, open anything that looks vaguely related and then crawl those for mentions of other MSCs. I’ll document the related ones with links and a brief description of the proposed change.
I include both proposed and closed MSCs to check for previously rejected ideas.
Are others interested in this? Have others had conversation about it? I roughly follow the #matrix-spec room or search for rooms that might be interested in the topic. I would recommend joining the #matrix-spec room for brainstorming and searching.
This step can help uncover any missed known issues and MSCs. I will also ask others with a longer history in the project if I am missing anything.
A brief competitive analysis is performed. Information can be gleaned from technical blog posts and API documentation. I consider not just competing products, but also investigate if others have solved similar technical problems. Other protocols are worth checking (e.g. IRC, XMPP, IMAP).
You can see an example of my research on Matrix read receipts & notifications.
Once I have compiled the above information, I jump into the current implementation to ensure it roughly matches the specification. [2] I start considering what protocol changes would solve the problem and are reasonable to implement. I find it useful to write down all of my ideas, not just the one I think is best. [3]
At this point I have:
- A problem statement
- A bunch of background about the current protocol, other proposed solutions, etc.
- A list of open questions
- Rough ideas for proposed solutions
The next step is to iterate with my colleagues: answer any open questions, check that our product goals will be met, and seek agreement that we are designing a buildable solution. [4]
Finally, I take the above and formalize it in into one or more Matrix Spec Changes. At this point I’ll think about error conditions / responses, backwards compatibility, security concerns, and any other parts of the full MSC. Once it is documented, I make a pull request with the proposal and self-review it for loose ends and clarity. I leave comments for any parts I am unsure about or want to open discussion on.
Then I ask my colleagues to read through it and wait for feedback from both them and any interested community members. It can be useful to be in the #matrix-spec room as folks might want to discuss the proposal.
[1]There’s a useful proposal template that I eventually use, but I do much of this process before constraining myself by that. [2]This consists of looking through code as well as just trying it out by manually making API calls or understanding how APIs power product features. [3]Part of the MSC proposal is documenting alternatives (and why you didn’t choose one of those). It is useful to brainstorm early before you’re set in a decision! [4]I usually do work with Matrix homeservers and am not as experienced with clients. It is useful to bounce ideas off a client developer to understand their needs.Firefox Add-on Reviews: Roblox browser extensions for better gaming
Every day, more than 50 million people play among millions of user-created games on Roblox. With a massive global audience and an ocean of games, there are vastly different ways users like to interact with Roblox. This is where the customization power of browser extensions can shine. If you’re a Roblox player or creator, you might be intrigued to explore some of these innovative extensions built just for Roblox users on Firefox.
BTRoblox
Packed with features, BTRoblox can do everything from altering Roblox's looks to providing potent new functionality.
Our favorite BTRoblox features include…
- Change the look. BTRoblox not only offers a bunch of custom themes to choose from (dark mode is always nice for game screen contrast), but you can even rearrange the site’s main navigation buttons and — perhaps best of all — hide ads.
- Currency converter. Get Robux converted into your currency of choice.
- Fast user search. Type in the search field and see auto-populated username returns.
“Amazing extension. I can’t use Roblox anymore without this extension installed. That’s how big a difference it makes.” — Firefox user whosblue.

RoGold
Another deep customization extension, RoGold offers a few familiar tricks like a currency converter and custom themes, but it really distinguishes itself with a handful of unique features.
Notably unique RoGold features include…
- Pinned games. Easily access your favorite content from a pin board.
- Live server stats. See FPS and ping rates instantly.
- Streamer mode. Play privately to avoid recognition.
- Bulk asset upload. Great for game creators, you can upload a huge number of decals at once (more asset varieties expected to be added over time).
- Original finder. Helps you identify original assets and avoid knock-offs prone to suddenly disappearing.
- View banned users. What a curious feature — it displays hidden profiles of banned users.
Roblox Server Finder
Sometimes you just need to find a Roblox game with enough room for you and a few friends. Roblox Server Finder is ideal for that. This single-feature extension simply tells you the number of players on any public server, so you'll know precisely which server can accommodate your party.
Very easy to use extension. Just select your preferred number of players with the slider and hit Smart Search. You’re good to go!
“A stress reliever for searching servers!” — Firefox user mico
Friend Removal Button
This feature isn't as sad as it sounds! Roblox puts a cap on the number of "friends" you're allowed to connect with. But over time Roblox players come and go, accounts get abandoned, things happen. Then you're left with a bunch of meaningless "friends" that clog your ability to form new Roblox connections. Friend Removal Button can help.
The extension adds a red-mark button to each friend card so you can easily prune your friend list anytime.
“Finally I don’t have max friends, thanks.” Niksky6
Roblox URL Launcher
Use URL links to conveniently join games, servers, or studio areas. Roblox URL Launcher can help with a slew of situations.
Roblox URL Launcher can help in these cases…
- Easily follow friends into live games with just a link.
- Go to a specific area of the studio.
- Join a server directly (also works with private servers if you have access).
Hopefully you found an extension that will enhance your Roblox experience on Firefox! Do you also play on Steam? If so, check out these excellent Steam extensions.
The Rust Programming Language Blog: Announcing Rust 1.66.1
The Rust team has published a new point release of Rust, 1.66.1. Rust is a programming language that is empowering everyone to build reliable and efficient software.
If you have a previous version of Rust installed via rustup, you can get 1.66.1 with:
rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.66.1 on GitHub.
What's in 1.66.1 stable
Rust 1.66.1 fixes Cargo not verifying SSH host keys when cloning dependencies or registry indexes with SSH. This security vulnerability is tracked as CVE-2022-46176, and you can find more details in the advisory.
Contributors to 1.66.1
Many people came together to create Rust 1.66.1. We couldn't have done it without all of you. Thanks!
The Rust Programming Language Blog: Security advisory for Cargo (CVE-2022-46176)
This is a cross-post of the official security advisory. The official advisory contains a signed version with our PGP key, as well.
The Rust Security Response WG was notified that Cargo did not perform SSH host key verification when cloning indexes and dependencies via SSH. An attacker could exploit this to perform man-in-the-middle (MITM) attacks.
This vulnerability has been assigned CVE-2022-46176.
Overview
When an SSH client establishes communication with a server, to prevent MITM attacks the client should check whether it already communicated with that server in the past and what the server's public key was back then. If the key changed since the last connection, the connection must be aborted as a MITM attack is likely taking place.
It was discovered that Cargo never implemented such checks, and performed no validation on the server's public key, leaving Cargo users vulnerable to MITM attacks.
Affected Versions
All Rust versions containing Cargo before 1.66.1 are vulnerable.
Note that even if you don't explicitly use SSH for alternate registry indexes or crate dependencies, you might be affected by this vulnerability if you have configured git to replace HTTPS connections to GitHub with SSH (through git's url.<base>.insteadOf setting), as that'd cause you to clone the crates.io index through SSH.
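For reference, this is the kind of git configuration (for example in ~/.gitconfig) that rewrites HTTPS GitHub URLs to SSH and would therefore cause Cargo to clone the crates.io index over SSH; the exact URL patterns depend on your own setup.

```
[url "ssh://git@github.com/"]
    insteadOf = https://github.com/
```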
Mitigations
We will be releasing Rust 1.66.1 today, 2023-01-10, changing Cargo to check the SSH host key and abort the connection if the server's public key is not already trusted. We recommend everyone upgrade as soon as possible.
Patch files for Rust 1.66.0 are also available here for custom-built toolchains.
For the time being Cargo will not ask the user whether to trust a server's public key during the first connection. Instead, Cargo will show an error message detailing how to add that public key to the list of trusted keys. Note that this might break your automated builds if the hosts you clone dependencies or indexes from are not already trusted.
If you can't upgrade to Rust 1.66.1 yet, we recommend configuring Cargo to use the git CLI instead of its built-in git support. That way, all git network operations will be performed by the git CLI, which is not affected by this vulnerability. You can do so by adding this snippet to your Cargo configuration file:
[net]
git-fetch-with-cli = true

Acknowledgments
Thanks to the Julia Security Team for disclosing this to us according to our security policy!
We also want to thank the members of the Rust project who contributed to fixing this issue. Thanks to Eric Huss and Weihang Lo for writing and reviewing the patch, Pietro Albini for coordinating the disclosure and writing this advisory, and Josh Stone, Josh Triplett and Jacob Finkelman for advising during the disclosure.
Updated on 2023-01-10 at 21:30 UTC to include additional mitigations.