Tag Archives: Apache

Using ShiftLeft in Open Source

After years of being involved in open-source development at Apache, we’ve seen security issues pop up in Apache Commons like arbitrary remote code execution, and denial of service attacks (CVE-2016-3092 and CVE-2014-0050). While some threats are real, other are just FUD. Even when they are real, it is important to consider context. There may be problems that end users never see because the “hole” is not reachable by users’ actions.

I started to wonder how we can short-circuit FUD and proactively deal with security issues before we hear about them in the press and in panicked mailing list posts. Is there something we can do at build time? A FindBugs or PMD on steroids? Surely someone else must be thinking about this. Can we apply the principle of shift left testing to security?

It turns out there is a tool out there that does and it’s called, not coincidentally, ShiftLeft.io. The idea behind ShiftLeft is to break old habits of building a product and then, later, figuring out how to fend off attacks and plug up security holes. Today, we take for granted that unit testing, integration testing, continuous integration, and continuous delivery are common place. ShiftLeft propose to make security analysis as ubiquitous.

Getting started

Since ShiftLeft is free for open source projects, I decided to look what it reports for Apache Commons IO 2.5, an Apache Commons Java component. While I won’t divulge right now any real security issues, I’ll show what is safe to report about Commons IO. I’ll also show the cool stuff ShiftLeft points out to write more bullet-proof software. To use ShiftLeft, you go to their web site and submit the URL to your GitHub repository. ShiftLeft has a 30 day disclosure policy so you have plenty of time to fix up your FOSS project.

What I got is a JSON report file containing all sorts of data and what ShiftLeft calls conclusions. These are the potentially actionable items. As we’ll see, even if you find some conclusions non-actionable, these will do a great deal to raise your awareness of potential security issues for code that you’ll write tomorrow or need to maintain today.

Let’s start with a simple conclusion and get deeper in the weeds after that.

No memory for you!

The following example shows what “untrusted” data is and how it can affect your application.

I have a ShiftLeft conclusion that tells me the method IOUtils.buffer(Writer, int) does not support handling untrusted data to be passed as parameter size because it controls the size of a buffer, giving an attacker the chance to starve the system of memory:

/**
 * Returns the given Writer if it is already a {@link BufferedWriter}, otherwise creates a BufferedWriter from the
 * given Writer.
 *
 * @param writer the Writer to wrap or return (not null)
 * @param size the buffer size, if a new BufferedWriter is created.
 * @return the given Writer or a new {@link BufferedWriter} for the given Writer
 * @throws NullPointerException if the input parameter is null
 * @since 2.5
 */
 public static BufferedWriter buffer(final Writer writer, int size) {
  return writer instanceof BufferedWriter ? (BufferedWriter) writer : new BufferedWriter(writer, size);
 }

While this example may seem trivial, ShiftLeft shows understanding of what the code does in this method: We are allowing call sites to control memory usage in an unbounded manner. ShiftLeft gives us the exact method and line number:

While some ShiftLeft conclusions may not all be actionable, I must say that the report has made me more aware of what to potentially secure when writing new code or maintaining old an code base.

zzfireplaceTangent: What could you do in this case? You could hard-code an upper bound of say 10 MB. You could make the bound configurable with a static non-final variable (effectively creating a global variable, an anti-pattern for sure.) You could move all the static methods to the instance side, create a memory profile class to define this bound, and then build IOUtils instances with this profile in the constructor. In addition, you could also use a different buffer class internally to make sure the buffer does not grow beyond a given size. And so on. I am not proposing these changes in the context of the Apache Commons IO 2.x line since we take great care to maintain binary compatibility within a major release across all of Apache Commons. But I can definitively see proposing changes for 3.0!

ShiftLeft’s understanding of code is deeper than this example thanks to Enhanced Code Property Graphs so let’s look at a more complex example.

Digging deeper

Here is a conclusion in raw form (edited for brevity):

{
 "id": "org.apache.commons.io.FileUtils.copyFileToDirectory:void(java.io.File,java.io.File)/srcFile/2",
 "description": "The method `copyFileToDirectory` does not support handling **sensitive data** to be passed as parameter `srcFile` because it is leaked over I/O **File**.",
 "unsupportedDataType": "SENSITIVE",
 "interfaceId": "FILE/false",
 "methodId": "org.apache.commons.io.FileUtils.copyFileToDirectory:void(java.io.File,java.io.File)",
 "codeLocationUrl": "https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/FileUtils.java#L1141",
 "state": "NEUTRAL",
 "externalIssueUrl": "https://todo"
 }

Looking at the methodId tells us to go look at FileUtils.copyFileToDirectory(File, File) where we find:

/**
 * Copies a file to a directory preserving the file date.
 *

 * This method copies the contents of the specified source file
 * to a file of the same name in the specified destination directory.
 * The destination directory is created if it does not exist.
 * If the destination file exists, then this method will overwrite it.
 *

 * <strong>Note:</strong> This method tries to preserve the file's last
 * modified date/times using {@link File#setLastModified(long)}, however
 * it is not guaranteed that the operation will succeed.
 * If the modification operation fails, no indication is provided.
 *
 * @param srcFile an existing file to copy, must not be {@code null}
 * @param destDir the directory to place the copy in, must not be {@code null}
 *
 * @throws NullPointerException if source or destination is null
 * @throws IOException if source or destination is invalid
 * @throws IOException if an IO error occurs during copying
 * @see #copyFile(File, File, boolean)
 */
 public static void copyFileToDirectory(final File srcFile, final File destDir) throws IOException {
  copyFileToDirectory(srcFile, destDir, true);
 }

This method just delegates to another copyFileToDirectory() with an added parameter, no big deal. What is interesting is that the codeLocationUrl points to code not in this method but to a private utility method:

https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/FileUtils.java#L1141

FileUtils at line 1141 is in the guts of a private method called org.apache.commons.io.FileUtils.doCopyFile(File, File, boolean) which is where ShiftLeft flags an issue where the method creates a new FileInputStream. To be honest, this is a beta and I am not convinced that the line numbers are always dead on. What is important here is that ShiftLeft is working with a code graph. Therefore, when I search the JSON conclusions for this URL, I find a total of 14 conclusions that use this URL. This tells me that this code fragment creates 14 possible attack points in the component; with a careful emphasis on possible since context is important.

Another key point is to realize that there are two key technologies at work here and that I expect both to get better as the beta progresses: First, building a code graph to give us the power to see that once a problem has been identified on a line of code, that all (I assume public) call-sites can be flagged. Second, what constitutes a problem or a conclusion in ShiftLeft’s neutral parlance will improve and be configurable, filterable and sortable.

In this example, the conclusion description reads:

The method `copyFileToDirectory` does not support handling **sensitive data** to be passed as parameter `srcFile` because it is leaked over I/O **File**.

What goes through my head when I read that is: Yeah, I do not want just anybody to be able to copy any file anywhere like overwriting a password vault a la copyFileToDirectory(myFile, "/etc/shadow"). Granted, Commons IO is a library, not an application, so there is no alarm bell to ring here, but you get the idea.

Stepping back, I think it is important to reiterate what happened here: ShiftLeft found an issue (less dramatic than a problem) on a line of code in a private methods, then, using its code graph, created conclusions (report items) for each public facing method that may eventually call this private method in its code path.

Are you Trusted and Sensitive?

ShiftLeft uses two terms in conclusion descriptions that I want to review. Based on the limited subset of 255 (a very computer friendly number!) conclusions I saw for all of Commons IO, we can see two types of issues: trusted and sensitive.

sensitive-data-in-the-cloud-blog-image-1Conclusions described as dealing with sensitive data: This says: Lookout, if you have a password in this variable, it’s in plain text. Now, it’s up to me to make sure that this password does not end up in a clear text file or anywhere else that is not secure. This is where context matters, you are the SME of your code, you know how much trouble you can get yourself and your users into, ShiftLeft has no opinion, it offers ‘conclusions.’

Conclusions referring to untrusted data: This tells me I should take precautions before doing anything with that data. Should I just execute this script? Should I need to worry about JSON Hijacking? See Why does Google prepend while(1); to their JSON responses?

Flipping it around on ShiftLeft

Let’s flip it around on ShiftLeft for a minute. I am now thinking about not writing sensitive data like passwords and financial reports to disk insecurely. I know we have many API in FileUtils that write strings to files. Will ShiftLeft tell me about, for example FileUtils.write(File, CharSequence, Charset)? Here is the method I should never use to write passwords or any sensitive data:

/**
 * Writes a CharSequence to a file creating the file if it does not exist.
 *
 * @param file the file to write
 * @param data the content to write to the file
 * @param encoding the encoding to use, {@code null} means platform default
 * @throws IOException in case of an I/O error
 * @since 2.3
 */
 public static void write(final File file, final CharSequence data, final Charset encoding) throws IOException {
  write(file, data, encoding, false);
 }

This in turn calls:

/**
 * Writes a CharSequence to a file creating the file if it does not exist.
 *
 * @param file the file to write
 * @param data the content to write to the file
 * @param encoding the encoding to use, {@code null} means platform default
 * @param append if {@code true}, then the data will be added to the
 * end of the file rather than overwriting
 * @throws IOException in case of an I/O error
 * @since 2.3
 */
 public static void write(final File file, final CharSequence data, final Charset encoding, final boolean append)
 throws IOException {
  final String str = data == null ? null : data.toString();
  writeStringToFile(file, str, encoding, append);
 }

Which calls:

/**
 * Writes a String to a file creating the file if it does not exist.
 *
 * @param file the file to write
 * @param data the content to write to the file
 * @param encoding the encoding to use, {@code null} means platform default
 * @param append if {@code true}, then the String will be added to the
 * end of the file rather than overwriting
 * @throws IOException in case of an I/O error
 * @since 2.3
 */
 public static void writeStringToFile(final File file, final String data, final Charset encoding, final boolean
 append) throws IOException {
   OutputStream out = null;
   try {
     out = openOutputStream(file, append);
     IOUtils.write(data, out, encoding);
     out.close(); // don't swallow close Exception if copy completes normally
   } finally {
     IOUtils.closeQuietly(out);
   }
 }

Which calls IOUtils‘:

/**
 * Writes chars from a <code>String</code> to bytes on an
 * <code>OutputStream</code> using the specified character encoding.
 *

 * This method uses {@link String#getBytes(String)}.
 *
 * @param data the <code>String</code> to write, null ignored
 * @param output the <code>OutputStream</code> to write to
 * @param encoding the encoding to use, null means platform default
 * @throws NullPointerException if output is null
 * @throws IOException if an I/O error occurs
 * @since 2.3
 */
 public static void write(final String data, final OutputStream output, final Charset encoding) throws IOException {
   if (data != null) {
     output.write(data.getBytes(Charsets.toCharset(encoding)));
   }
 }

Knowing what I know, I expect ShiftLeft to conclude that all these methods do not support sensitive data. Working back up the stack, I find:

  • org.apache.commons.io.IOUtils.write(String, OutputStream, Charset)
    Nothing on that one; did I miss it due to the 255 conclusion limit?
  • org.apache.commons.io.FileUtils.writeStringToFile(File, String, Charset, boolean)
    Got it:

     {
     "id": "org.apache.commons.io.FileUtils.writeStringToFile:void(java.io.File,java.lang.String,boolean)/file/1",
     "description": "The method `writeStringToFile` does not support handling **untrusted data** to be passed as parameter `file` because it controls access to I/O **File** in a manner that would allow an attacker to abuse it.",
     "unsupportedDataType": "ATTACKER_CONTROLLED",
     "interfaceId": "FILE/false",
     "methodId": "org.apache.commons.io.FileUtils.writeStringToFile:void(java.io.File,java.lang.String,boolean)",
     "codeLocation": {
       "file": "org/apache/commons/io/FileUtils.java",
       "lineNumber": 360,
       "symbol": "parent"
     },
     "codeLocationUrl": "https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/FileUtils.java#L360",
     "state": "NEUTRAL",
     "externalIssueUrl": "https://todo"
     }
    
  • org.apache.commons.io.FileUtils.write(File, CharSequence, Charset, boolean)
    Got it:

    {
     "id": "org.apache.commons.io.FileUtils.write:void(java.io.File,java.lang.CharSequence,java.nio.charset.Charset,boolean)/file/2",
     "description": "The method `write` does not support handling **sensitive data** to be passed as parameter `file` because it is leaked over I/O **File**.",
     "unsupportedDataType": "SENSITIVE",
     "interfaceId": "FILE/false",
     "methodId": "org.apache.commons.io.FileUtils.write:void(java.io.File,java.lang.CharSequence,java.nio.charset.Charset,boolean)",
     "codeLocation": {
       "file": "org/apache/commons/io/FileUtils.java",
       "lineNumber": 355,
       "symbol": "parent"
     },
     "codeLocationUrl": "https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/FileUtils.java#L355",
     "state": "NEUTRAL",
     "externalIssueUrl": "https://todo"
     }
    
  • org.apache.commons.io.FileUtils.write(File, CharSequence, Charset)
    Got it:

    {
     "id": "org.apache.commons.io.FileUtils.write:void(java.io.File,java.lang.CharSequence,java.nio.charset.Charset)/file/2",
     "description": "The method `write` does not support handling **sensitive data** to be passed as parameter `file` because it is leaked over I/O **File**.",
     "unsupportedDataType": "SENSITIVE",
     "interfaceId": "FILE/false",
     "methodId": "org.apache.commons.io.FileUtils.write:void(java.io.File,java.lang.CharSequence,java.nio.charset.Charset)",
     "codeLocation": {
       "file": "org/apache/commons/io/FileUtils.java",
       "lineNumber": 355,
       "symbol": "parent"
     },
     "codeLocationUrl": "https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/FileUtils.java#L355",
     "state": "NEUTRAL",
     "externalIssueUrl": "https://todo"
     }
    

So yeah, that all hangs nicely together, thank you code graphs!

Finding arbitrary code attacks

Did I mention there 255 are conclusions in the JSON report file? It takes a while to go through these. I am hoping that ShiftLeft will have their UI in place soon so I can filter and sort all this information! Now that I am about 20% through the file, I see:

The method `createSymbolicLink` does not support handling **untrusted data** to be passed as parameter `symlink` because it could allow an attacker to run arbitrary code.

runaway_train

Yikes! let’s take a deeper look and see if this is for real or a false positive.

Here is the raw conclusion:

{
 "id": "org.apache.commons.io.Java7Support.createSymbolicLink:java.io.File(java.io.File,java.io.File)/symlink/1",
 "description": "The method `createSymbolicLink` does not support handling **untrusted data** to be passed as parameter `symlink` because it could allow an attacker to run arbitrary code.",
 "unsupportedDataType": "ATTACKER_CONTROLLED",
 "methodId": "org.apache.commons.io.Java7Support.createSymbolicLink:java.io.File(java.io.File,java.io.File)",
 "codeLocation": {
 "file": "org/apache/commons/io/Java7Support.java",
 "lineNumber": 128,
 "symbol": "file"
 },
 "codeLocationUrl": "https://github.com/apache/commons-io/blob/commons-io-2.5/src/main/java/org/apache/commons/io/Java7Support.java#L128",
 "state": "NEUTRAL",
 "externalIssueUrl": "https://todo"
 }

Our codeLocationUrl for this conclusion points us to the Java7Support class at line 128: where we find:

/**
 * Indicates if a symlunk target exists
 * @param file The symlink file
 * @return true if the target exists
 * @throws IOException upon error
 */
 private static boolean exists(File file)
 throws IOException {
   try {
     Object path = toPath.invoke(file);
     final Boolean result = (Boolean) exists.invoke(null, path, emptyLinkOpts);
     return result.booleanValue();
   } catch (IllegalAccessException e) {
     throw new RuntimeException(e);
   } catch (InvocationTargetException e) {
     throw (RuntimeException) e.getTargetException();
   }
}

ShiftLeft points to the line:

Object path = toPath.invoke(file);

The instance variable toPath is a java.lang.reflect.Method which can and does execute code as shown above. Looking narrowly at the code so far we can say that yes, this code run anything since toPath is a Method.

However, widening our view to the field declaration we see the following in the class static initialization block:

toPath = File.class.getMethod("toPath");

This makes sense in the context of the class: Java7Support is used to access Java 7 features while running on pre-Java 7 platforms. Here we are setting up toPath to run one method. I would expect toPath to be a static final but it is not:

private static Method toPath;

Why is it not static? Well, it’s just that the way the static initialize block is written does not allow you to just add final to the declaration. The static block needs to be rewritten to allow for toPath to be final which we will leave as ‘an exercise to the reader’ 😉 as it is out of scope for an already long blog post.

I would be curious to see how ShiftLeft responds to such a code change.

I am not sure if this is really a problem though. The variable is private now, but not final. Yes its type (Method) is all about executing code. Under normal circumstances, this value cannot be changed outside this class. I can use reflection of course to force a new value in toPath. Does that mean that anytime I use a Method instance variable I am going to get an arbitrary code execution conclusion? Another corner-case to examine.

What if I rewrote the static block and declared the variable final. Would ShiftLeft still reach the same conclusion? If yes, would that be because I could still use reflection to hammer any value in the field.

Concluding on this first arbitrary code attack

The more I explore these test results, the more I realize how tricky security is and how much context matters. I now know that the Java7Support class in Apache Commons IO 2.5 is open to an arbitrary code attack under the narrow use case of another piece of code using reflection. But if that code is allowed to use reflection, surely it could achieve its goal without going through the extra hoop of Java7Support hacking.

Stepping back, the realization is that I should think twice about using the Method class because I could open my application up to an attack unless that Method field is properly protected.

Looking for more arbitrary code attacks

Now that ShiftLeft has whetted my appetite, I wonder if there are more arbitrary code attacks lurking. A quick search through the file reveals to total of five. Not surprisingly, these are all in the Java7Support class and all follow the same pattern as above: calling the invoke method of a Method object where the Methodis initialized in the static block.

Flipping it around once more, let’s look at the Java7Support class variable declarations and see if all Method objects end up being accounted for by ShiftLeft:

/**
 * Java7 feature detection and reflection based feature access.
 * <p/>
 * Taken from maven-shared-utils, only for private usage until we go full java7
 */
class Java7Support {

  private static final boolean IS_JAVA7;

  private static Method isSymbolicLink;

  private static Method delete;

  private static Method toPath;

  private static Method exists;

  private static Method toFile;

  private static Method readSymlink;

  private static Method createSymlink;
  ...

We have seven static Method declarations which I see initialized in the static block:

static {
 boolean isJava7x = true;
 try {
   ClassLoader cl = Thread.currentThread().getContextClassLoader();
   Class<?> files = cl.loadClass("java.nio.file.Files");
   Class<?> path = cl.loadClass("java.nio.file.Path");
   Class<?> fa = cl.loadClass("java.nio.file.attribute.FileAttribute");
   Class<?> linkOption = cl.loadClass("java.nio.file.LinkOption");
   isSymbolicLink = files.getMethod("isSymbolicLink", path);
   delete = files.getMethod("delete", path);
   readSymlink = files.getMethod("readSymbolicLink", path);
   emptyFileAttributes = Array.newInstance(fa, 0);
   createSymlink = files.getMethod("createSymbolicLink", path, path, emptyFileAttributes.getClass());
   emptyLinkOpts = Array.newInstance(linkOption, 0);
   exists = files.getMethod("exists", path, emptyLinkOpts.getClass());
   toPath = File.class.getMethod("toPath");
   toFile = path.getMethod("toFile");
   } catch (ClassNotFoundException e) {
     isJava7x = false;
   } catch (NoSuchMethodException e) {
     isJava7x = false;
   }
   IS_JAVA7 = isJava7x;
 }

ShiftLeft gives me five conclusions:

The method `isSymLink` does not support handling **untrusted data** to be passed as parameter `file` because it could allow an attacker to run arbitrary code.
The method `createSymbolicLink` does not support handling **untrusted data** to be passed as parameter `symlink` because it could allow an attacker to run arbitrary code.
The method `delete` does not support handling **untrusted data** to be passed as parameter `file` because it could allow an attacker to run arbitrary code.
The method `createSymbolicLink` does not support handling **untrusted data** to be passed as parameter `target` because it could allow an attacker to run arbitrary code.
The method `readSymbolicLink` does not support handling **untrusted data** to be passed as parameter `symlink` because it could allow an attacker to run arbitrary code.

All of the Method declarations in this class are used by all of the methods listed above. Nice.

Fin

fin_3_0I’d like to wrap up this exploration of ShiftLeft with a quick summary of what we found: a tool we can add to our build pipelines to find potential security issues. There are a lot of data here, and this is just for Apache Commons IO 2.5, so another lesson is that context matters. ShiftLeft currently provides ‘conclusions’ based on a code graph. We found conclusions about untrusted data (I’m not sure what’s in here so don’t go executing it), sensitive data (don’t save passwords in plain text!) and arbitrary code attacks.

I hope revisit this story and run ShiftLeft on other Apache Commons projects soon. This sure is fun!

Happy Coding,
Gary Gregory