Roundcube Webmail



$ git clone
$ cd roundcubemail
$ scc . | head -8
Language                 Files     Lines   Blanks  Comments     Code Complexity
PHP                        526    123939    18225     28447    77267      13323
SQL                        110      2642      419       238     1985          0
JavaScript                 100     29353     3617      2800    22936       4827
HTML                        50      2738      304        31     2403          0
Shell                       21      2432      345        50     2037        323



Q: What are we missing here?

A: Input sanitisation

  • The attacker sends a POST request to the installer:

    POST /roundcube/installer/index.php HTTP/1.1
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 1049
  • The attacker sends an email containing an image of non-standard format.

  • Roundcube will try to convert the image to JPG.

  • The command stored in _im_convert_path will be executed.

  • The attacker will have a reverse shell.

From DrunkenShells's Disclosures repository


But ... Was it preventable?

Yes, but ...

Not with standard linters or scanners

private static function getCommand($opt_name)
{
    static $error = [];

    $cmd = rcube::get_instance()->config->get($opt_name);

    if (empty($cmd)) {
        return false;
    }

    if (preg_match('/^(convert|identify)(\.exe)?$/i', $cmd)) {
        return $cmd;
    }

    // Executable must exist, also disallow network shares on Windows
    if ($cmd[0] != "\\" && file_exists($cmd)) {
        return $cmd;
    }

    if (empty($error[$opt_name])) {
        rcube::raise_error("Invalid $opt_name: $cmd", true, false);
        $error[$opt_name] = true;
    }

    return false;
}

From program/lib/Roundcube/rcube_image.php

Taint analysis

  • Following the program's execution flow and looking for:
    • Attacker-controlled data: rcube::get_instance()->config
    • Sensitive sink: return
rules:
  - id: return-unsanitised-config
    languages:
      - php
    message: A value taken from the configuration is returned without sanitisation.
    mode: taint
    pattern-sources:
      - patterns:
        - pattern: rcube::get_instance()->config->get($KEY);
    pattern-sanitizers:
      - pattern: escapeshellcmd(...)
    pattern-sinks:
      - patterns:
        - pattern-regex: "return"
    severity: ERROR

A Semgrep rule using taint tracking
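
The source, sanitiser, and sink roles used by the rule can be illustrated with a toy runtime taint tracker. This is only a sketch in Python: Semgrep tracks taint statically, and the names `Tainted`, `read_config`, `sanitise`, and `run_converter` are hypothetical helpers invented for the illustration. Shell quoting stands in for whatever sanitiser fits the real sink.

```python
import shlex


class Tainted(str):
    """Marker type for attacker-controlled strings (hypothetical helper)."""


def read_config(config: dict, key: str) -> str:
    # Source: anything read from the configuration is treated as tainted.
    return Tainted(config[key])


def sanitise(value: str) -> str:
    # Sanitiser: quote for the shell and return a plain str, dropping the mark.
    return str(shlex.quote(value))


def run_converter(command: str) -> str:
    # Sink: refuse tainted input instead of handing it to a shell.
    if isinstance(command, Tainted):
        raise ValueError("tainted value reached a sensitive sink")
    return f"would execute: {command}"


config = {"im_convert_path": "convert; id"}
raw = read_config(config, "im_convert_path")

try:
    run_converter(raw)
    blocked = False
except ValueError:
    blocked = True

safe = run_converter(sanitise(raw))
```

The unsanitised value is rejected at the sink, while the quoted copy passes because the sanitiser returns an unmarked string.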

private static function getCommand($opt_name)
{
    static $error = [];

    $cmd = rcube::get_instance()->config->get($opt_name);

    if (empty($cmd)) {
        return false;
    }

    if (preg_match('/^(convert|identify)(\.exe)?$/i', $cmd)) {
        return $cmd;
    }

    // Executable must exist, also disallow network shares on Windows
    if ($cmd[0] != "\\" && file_exists($cmd)) {
        return $cmd;
    }

    // ...
}


The Open Source Fortress

  • Collection of OSS tools that can be used to proactively detect vulnerabilities
  • Structure
    • Factual information
      • General software and software security topics
      • Brief presentation of each analysis technique
    • Practical examples for analysing a vulnerable codebase
      • Infrastructure and access
      • Documentation
      • Proposed solutions

But why open source?

  • Second layer of security when used with paid products
  • Replacement for paid products
  • Lower engineering effort compared with in-house solutions
  • Default collaboration


Defensive activities

The examples are from the Log4Shell vulnerability in Log4j.

Offensive activities

  • Exploit writing
    • Attack vector: through VMware Horizon
    • Mitigation bypass: T1036.004
    • Weaponisation: T1573.001
  • Exploitation

As reported by CISA in AA22-174A

Sand Castle

  • Vulnerable-by-design codebase
  • "lightweight piece of software that runs on a Debian-based server and allows users to control it through their browsers"
  • On-premise deployment
  • Written in Python and C
  • 12+ embedded vulnerabilities

Threat modelling

  • Identifying assets and threats
    • What do we need to defend?
    • What can go wrong?
  • Legal requirement (e.g., USA and Singapore)

OWASP Threat Dragon

  • Threat modelling tool backed by OWASP
  • Usual process
    1. Threat model creation
    2. Diagram creation: STRIDE, CIA
    3. Asset representation: stores, processes, actors, data flows, trust boundaries
    4. Manual threat identification, with type, status, score, priority, description, and mitigation


Code querying

  • Searching for a specific pattern in the codebase
  • Optional abstract representation of the codebase
    • Abstract syntax trees
    • Control flow graphs
  • Query types
    • Literals: scanf
    • Regex: scanf\(.*\)
    • Data structures: ({cpg.method("(?i)scanf").callIn}).l in Joern's CPGQL
  • Community queries (but generic)


From Trail of Bits' "Fast and accurate syntax searching for C and C++"
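
The difference between a regex query and a query over an abstract representation can be sketched with Python's standard `ast` module. The sample source and the helper `is_os_system_call` are made up for the illustration. The regex matches text, so it also hits a comment and a string literal, while the AST query only matches a real call expression.

```python
import ast
import re

SOURCE = '''
import os

def run(cmd):
    # os.system(cmd) is mentioned in a comment only
    note = "never call os.system(cmd)"
    return os.system(cmd)
'''

# Regex query: works on raw text, so comments and strings are reported too.
regex_hits = [
    number
    for number, line in enumerate(SOURCE.splitlines(), start=1)
    if re.search(r"os\.system\(.*\)", line)
]


def is_os_system_call(node: ast.AST) -> bool:
    """True when the node is a call of the form os.system(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "system"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "os"
    )


# AST query: only genuine call expressions match.
ast_hits = [
    node.lineno
    for node in ast.walk(ast.parse(SOURCE))
    if is_os_system_call(node)
]
```

Here `regex_hits` holds three line numbers and `ast_hits` only the line of the actual call.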

$ pip install semgrep
- id: secret-logging
  patterns:
    - pattern-either:
        - pattern: $LOGGING_LIB.$METHOD(..., $MESSAGE, ...)
    - metavariable-pattern:
        metavariable: $LOGGING_LIB
        patterns:
            - pattern-either:
                - pattern: logging
                - pattern: logger
    - metavariable-pattern:
        metavariable: $MESSAGE
        patterns:
            - pattern-either:
                - pattern: <... password ...>
                - pattern: <... token ...>
            - pattern-not: |

$ semgrep scan                              \
  --sarif                                   \
  --config ~/analysis/semgrep-rules         \
  --output ~/analysis/semgrep.custom.sarif  \
│ Scan Status │
  Scanning 17 files (only git-tracked) with 4 Code rules:


│ Scan Summary │
Some files were skipped or only partially analyzed.
  Scan was limited to files tracked by git.

Ran 4 rules on 11 files: 9 findings.
    f"Authenticating user with credentials: {username}:{password}"



Semgrep

  • (Partially) open-source code scanner
  • Support for 30+ programming languages
  • No prior build requirements
  • No DSL for rules
  • Default or third-party rules



Fuzzing

  • Running a program and feeding it random, unexpected inputs
  • A crash = a security issue
    • *NULL
    • Sanitizers: ASan, UBSan, etc.
  • BFS traversal of the CFG
  • Optimisations


From AdaCore's "Finding Vulnerabilities using Advanced Fuzz testing and AFLplusplus v3.0"
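
The core loop of a fuzzer can be sketched in a few lines of Python. This is a blind mutation fuzzer, far simpler than the coverage-guided approach of AFL++; the target `parse_record` and the helper `fuzz` are hypothetical, and the raised `IndexError` stands in for the out-of-bounds read a memory-unsafe language would perform.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy target: the first byte is a length field that is trusted blindly."""
    if not data:
        return b""
    length = data[0]
    payload = data[1:]
    if length > len(payload):
        # Stands in for an out-of-bounds read in a memory-unsafe language.
        raise IndexError("length field exceeds payload size")
    return payload[:length]


def fuzz(target, seed: bytes, iterations: int = 2000, rng_seed: int = 0):
    """Mutate the seed at random and collect the inputs that crash the target."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = bytearray(seed)
        # Overwrite one to three random bytes of the seed.
        for _ in range(rng.randint(1, 3)):
            position = rng.randrange(len(candidate))
            candidate[position] = rng.randrange(256)
        try:
            target(bytes(candidate))
        except Exception as error:
            crashes.append((bytes(candidate), repr(error)))
    return crashes


crashes = fuzz(parse_record, seed=b"\x03abc")
```

Every saved crash has a length byte larger than the three-byte payload, which is exactly the condition the target fails to handle.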

$ docker run -it aflplusplus/aflplusplus /bin/bash
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  int length, read_length;
  char *buffer, *filename;

  if (argc != 2) {
    return 1;
  }

  // AFL++ replaces @@ with the path of the generated input file
  filename = argv[1];
  FILE *f = fopen(filename, "rb");
  if (f == NULL) {
    return 1;
  }

  // Read the whole file into memory
  fseek(f, 0, SEEK_END);
  length = ftell(f);
  fseek(f, 0, SEEK_SET);
  buffer = malloc(length);
  fread(buffer, 1, length, f);
  fclose(f);

  generate_recovery_token(buffer + 4, buffer);

  return 0;
}
$ AFL_USE_ASAN=1 /AFLplusplus/afl-cc            \
  -g                                            \
  -o crash_me_if_u_can.elf                      \
  generate_recovery_token.c sha256.c harness.c
$ afl-fuzz                              \
  -i ~/analysis/afl++/c_modules/inputs  \
  -o ~/analysis/afl++/c_modules/outputs \
  --                                    \
  ./crash_me_if_u_can.elf @@
    american fuzzy lop ++4.09a {default} (./crash_me_if_u_can.elf) [fast]
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│        run time : 0 days, 0 hrs, 0 min, 0 sec       │  cycles done : 0     │
│   last new find : none seen yet                     │ corpus count : 1     │
│last saved crash : 0 days, 0 hrs, 0 min, 0 sec       │saved crashes : 1     │
│ last saved hang : none seen yet                     │  saved hangs : 0     │
├─ cycle progress ─────────────────────┬─ map coverage┴──────────────────────┤
│  now processing : 0.2 (0.0%)         │    map density : 26.79% / 26.79%    │
│  runs timed out : 0 (0.00%)          │ count coverage : 5.27 bits/tuple    │
├─ stage progress ─────────────────────┼─ findings in depth ─────────────────┤
│  now trying : havoc                  │ favored items : 1 (100.00%)         │
│ stage execs : 151/459 (32.90%)       │  new edges on : 1 (100.00%)         │
│ total execs : 173                    │ total crashes : 1 (1 saved)         │
│  exec speed : 99.88/sec (slow!)      │  total tmouts : 19 (0 saved)        │
├─ fuzzing strategy yields ────────────┴─────────────┬─ item geometry ───────┤
│   bit flips : disabled (default, enable with -D)   │    levels : 1         │
│  byte flips : disabled (default, enable with -D)   │   pending : 0         │
│ arithmetics : disabled (default, enable with -D)   │  pend fav : 0         │
│  known ints : disabled (default, enable with -D)   │ own finds : 0         │
│  dictionary : n/a                                  │  imported : 0         │
│havoc/splice : 1/12, 0/0                            │ stability : 100.00%   │
│py/custom/rq : unused, unused, unused, unused       ├───────────────────────┘
│    trim/eff : 20.00%/1, disabled                   │          [cpu001:350%]
└─ strategy: explore ────────── state: started :-) ──┘

server_recovery_passphrase = getenv("SANDCASTLE_RECOVERY_PASSPHRASE");
if (server_recovery_passphrase == NULL)
  return NULL;

passphrase_len = strlen(server_recovery_passphrase) - 1;

buf = (BYTE *)malloc(SHA256_BLOCK_SIZE * sizeof(BYTE));
if (!buf)
  return NULL;

// Prevent buffer overflow by allocating more
hashed_len = length + passphrase_len;
hashed = (BYTE *)malloc(10 * hashed_len * sizeof(BYTE));
if (!hashed) {
  return NULL;
}

strcpy(hashed, server_recovery_passphrase);
strcpy(hashed + passphrase_len, data);




Secret scanning

  • Secrets
    • API keys
    • Credentials
    • Tokens
  • Searching for known patterns or for high-entropy strings that may be a secret
  • Community (generic) rules
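
The entropy check can be sketched as a Shannon entropy computation over the characters of a candidate string. The function names, the minimum length, and the 4.0 bits-per-character threshold below are illustrative assumptions, not the values used by any particular scanner.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Average number of bits per character in the string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    # Both limits are illustrative assumptions, not taken from a real tool.
    return len(token) >= min_length and shannon_entropy(token) >= threshold


# A repetitive string has low entropy; a made-up random-looking token has more.
low = looks_like_secret("password_password_password")
high = looks_like_secret("q8Zr2LxV7mNc4TbY9pKd")
```

Entropy alone produces false positives (hashes, UUIDs), which is why scanners usually combine it with pattern rules.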

Download a binary from the GitHub releases.

$ gitleaks                                \
  --no-banner                             \
  detect                                  \
  --report-format sarif                   \
  --source ~/codebase                     \
  --report-path ~/analysis/gitleaks.sarif \
5:48PM INF 68 commits scanned.
5:48PM INF scan completed in 196ms   
5:48PM WRN leaks found: 5

app = Flask(__name__)

app.secret_key = (

LOG_LOCATION = "/var/log/sandcastle.log"



Gitleaks

  • Detector for hard-coded secrets
  • Analysis of the entire Git history
  • Support for baselines and custom formats of secrets


Dependency scanning

  • Iterating through all dependencies to find their known vulnerabilities
  • Uses the project's dependency declaration list (for example, a lockfile)
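
The matching step can be sketched as a comparison between pinned versions and the first fixed version named in an advisory. All package names and advisory identifiers below are fictional, and `parse_version` assumes purely numeric dotted versions; a real scanner queries a vulnerability database and handles full version-range semantics.

```python
def parse_version(text: str) -> tuple:
    """Turn '9.5.0' into (9, 5, 0); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory data; a real scanner would query a vulnerability database.
ADVISORIES = [
    {"id": "EXAMPLE-0001", "package": "imagelib", "fixed_in": "10.0.1"},
    {"id": "EXAMPLE-0002", "package": "webframework", "fixed_in": "3.0.1"},
]

# Pinned versions, as they would be read from a lockfile.
LOCKED = {"imagelib": "9.5.0", "webframework": "3.0.1", "helpers": "1.16.0"}


def scan(locked: dict, advisories: list) -> list:
    """Report every advisory whose fix is newer than the pinned version."""
    findings = []
    for advisory in advisories:
        version = locked.get(advisory["package"])
        if version and parse_version(version) < parse_version(advisory["fixed_in"]):
            findings.append((advisory["id"], advisory["package"], version))
    return findings


findings = scan(LOCKED, ADVISORIES)
```

Only the package pinned below its fixed version is reported; the one already at the fixed version is not.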

Download a binary from the GitHub releases.

$ osv-scanner                                     \
  --lockfile ~/codebase/sandcastle/poetry.lock 
Scanned ~/codebase/sandcastle/poetry.lock file and found 23 packages
│ OSV URL │ CVSS │ ECOSYSTEM │ PACKAGE  │ VERSION │ SOURCE                          │
│         │      │ PyPI      │ pillow   │ 9.5.0   │ codebase/sandcastle/poetry.lock │
│         │ 8.8  │ PyPI      │ pillow   │ 9.5.0   │ codebase/sandcastle/poetry.lock │
│         │      │ PyPI      │ pillow   │ 9.5.0   │ codebase/sandcastle/poetry.lock │
│         │ 8    │ PyPI      │ werkzeug │ 3.0.0   │ codebase/sandcastle/poetry.lock │

python = "^3.10"
Flask = "^2.3.3"
python-pam = "^2.0.2"
six = "^1.16.0"
pillow = "^9.5.0"

10.0.1 (2023-09-15)

    Updated libwebp to 1.3.2 #7395 [radarhere]
    Updated zlib to 1.3 #7344 [radarhere]




Linting

  • Static analysis for finding issues before compiling/running the code
  • Issues
    • Formatting
    • Grammar (for example, non-inclusive expressions)
    • Security
$ pip install bandit
$ bandit                                          \
  --recursive ~/codebase/sandcastle/sandcastle/   \
  --format sarif                                  \
  --output ~/analysis/bandit.sarif
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.11.6
[formatter]     INFO    SARIF output written to file: /home/iosifache/analysis/bandit.sarif
for tarinfo in tar:
    name =
    if tarinfo.isreg():
            filename = f"{extract_dir}/{name}"
            os.rename(os.path.join(tmp, name), filename)

        except Exception:

    os.makedirs(f"{extract_dir}/{name}", exist_ok=True)


Bandit

  • Linter for Python
  • Abstract syntax tree representation of the code
  • Custom modules for:
    • Patterns of suspicious code
    • Deny lists of imports and function calls
    • Report generation
  • Support for baselines
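
A deny list applied over an abstract syntax tree can be sketched with Python's standard `ast.NodeVisitor`. This is not Bandit's plugin API; the class `DenyListVisitor`, the deny lists, and the sample source are assumptions made for the illustration.

```python
import ast

# Illustrative deny lists, not Bandit's actual configuration.
DENIED_IMPORTS = {"pickle", "telnetlib"}
DENIED_CALLS = {"eval", "exec"}


class DenyListVisitor(ast.NodeVisitor):
    """Collect (line, description) pairs for deny-listed imports and calls."""

    def __init__(self):
        self.findings = []

    def visit_Import(self, node):
        for alias in node.names:
            if alias.name.split(".")[0] in DENIED_IMPORTS:
                self.findings.append((node.lineno, f"import of {alias.name}"))
        self.generic_visit(node)

    def visit_ImportFrom(self, node):
        if node.module and node.module.split(".")[0] in DENIED_IMPORTS:
            self.findings.append((node.lineno, f"import from {node.module}"))
        self.generic_visit(node)

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name) and node.func.id in DENIED_CALLS:
            self.findings.append((node.lineno, f"call to {node.func.id}"))
        self.generic_visit(node)


SAMPLE = "import pickle\nfrom os import path\nresult = eval(user_input)\n"

visitor = DenyListVisitor()
visitor.visit(ast.parse(SAMPLE))
```

After the visit, `visitor.findings` lists the `pickle` import on line 1 and the `eval` call on line 3, while the harmless `os` import is ignored.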


Symbolic execution for taint analysis

  • Investigating all CFG paths by replacing the concrete values with symbolic ones
  • Components
    • Sources
    • Sinks
    • Patterns
  • Path explosion problem
int f(int a, int b) {
    int x = 1, y = 0;

    if (a != 0) {
        y = x + 3;
        if (b == 0) {
            x = 2 * (a + b);
        }
    }

    // Division by zero when x == y
    return (a + b) / (x - y);
}


From symflower's "What is symbolic execution for software programs"
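
The three execution paths of `f` can be written down by hand and checked for a division by zero. The sketch below is in Python and makes two loud assumptions: the paths are encoded manually instead of being derived from the code, and a bounded brute-force search replaces the constraint solver a real symbolic execution engine would use. The names `PATHS` and `find_division_by_zero` are hypothetical.

```python
from itertools import product

# One entry per path through f: (path condition, its check, symbolic x, symbolic y).
PATHS = [
    ("a == 0",
     lambda a, b: a == 0, lambda a, b: 1, lambda a, b: 0),
    ("a != 0 and b != 0",
     lambda a, b: a != 0 and b != 0, lambda a, b: 1, lambda a, b: 4),
    ("a != 0 and b == 0",
     lambda a, b: a != 0 and b == 0, lambda a, b: 2 * (a + b), lambda a, b: 4),
]


def find_division_by_zero(bound: int = 8) -> list:
    """Per path, search small integers for inputs that make x - y equal zero."""
    findings = []
    for name, condition, x, y in PATHS:
        for a, b in product(range(-bound, bound + 1), repeat=2):
            # Stand-in for a solver: test the path condition and the error condition.
            if condition(a, b) and x(a, b) - y(a, b) == 0:
                findings.append((name, a, b))
                break
    return findings


findings = find_division_by_zero()
```

Only the third path can fail: with `b == 0` the denominator becomes `2 * a - 4`, which is zero for `a == 2`.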

$ docker run -it klee/klee /bin/bash
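#include <klee/klee.h>

int main() {
  char re[10];
  int count;

  // Mark the buffer as symbolic, then force it to be NUL-terminated
  klee_make_symbolic(re, sizeof re, "re");
  re[9] = '\0';

  // Mark the length as symbolic too
  klee_make_symbolic(&count, sizeof(int), "count");

  generate_recovery_token(re, count);

  return 0;
}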
$ clang               \
  -emit-llvm          \
  -c                  \
  -g                  \
  -O0                 \
  -Xclang             \
  -disable-O0-optnone \
  -I .                \
  source.c            \
  -o source.bc
$ klee source.bc
KLEE: NOTE: found huge malloc, returning 0
KLEE: ERROR: source.c:216: concretized symbolic size
KLEE: NOTE: now ignoring this error at this location
KLEE: WARNING ONCE: calling external: strcpy(94204336258496, 94204335341000) at source.c:224 10
KLEE: ERROR: source.c:118: memory error: out of bound pointer
KLEE: NOTE: now ignoring this error at this location


KLEE

  • Generic symbolic execution with security use cases
  • Built on LLVM

Other techniques


Security tooling automation


Do security-focused work!

  • Create a threat model.
  • Do a security review and report your findings.
  • Implement new security mitigations.
  • Propose or backport patches.
  • Create new workflows for security scanning.
  • Integrate the project in OSS-Fuzz.



  • Give it a GitHub star.
  • Share it with your friends or followers.
  • Write a short feedback email to the maintainers.


