CCI and Metadata
CCI represents an assembly’s metadata through the CCI Metadata model. The associated CCI Metadata API is used by all CCI applications, even those that use CCI Code to work with code blocks. The following sections describe the basics of CCI Metadata.
Many aspects of applications are specific to the application environment. For example:
- Applications can target any of several mscorlib.dll versions. Typically, applications specify unification, which directs .NET to use the most recent version on the system, but they can also target a specific version.
- Applications can be stored in a variety of ways. Typically, applications are stored on the hard drive, but they could also be stored in a database, downloaded from a Web site, and so on. Applications can even be embedded in a Word or Excel document.
The CCI libraries are not associated with any particular application environment. For example, you can use CCI to work with assemblies from any kind of storage. However, the CCI libraries require application-specific information such as where files are located,
where to find referenced assemblies, or the targeted .NET version.
From a CCI perspective, such issues are matters of application policy and are handled by a separate host object that understands the application environment. CCI queries the host for application-specific information. The host handles the details and returns the results to CCI through a standard interface. For example, when CCI must locate an assembly, it calls a method on the host and passes it a string that identifies the assembly. The string is typically a file path, but it could represent another type of storage, such as a SQL query to retrieve the assembly from a database. The host interprets the string appropriately and retrieves the assembly.
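The division of labor between the library and the host can be sketched in a few lines. The following Python sketch is language-neutral and is not the CCI API: `Host`, `FileHost`, `InMemoryHost`, `load_unit`, and `unit_size` are hypothetical names chosen for illustration. It shows only the idea that the library hands an opaque string to the host and the host decides what that string means.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Host(ABC):
    """Hypothetical host interface: the library asks, the host decides how."""

    @abstractmethod
    def load_unit(self, location: str) -> bytes:
        """Return the raw bytes of the assembly identified by `location`."""


class FileHost(Host):
    """Treats the location string as a file path."""

    def load_unit(self, location: str) -> bytes:
        return Path(location).read_bytes()


class InMemoryHost(Host):
    """Treats the location string as a key into some other store."""

    def __init__(self, store: dict[str, bytes]) -> None:
        self._store = store

    def load_unit(self, location: str) -> bytes:
        return self._store[location]


def unit_size(host: Host, location: str) -> int:
    """Library-side code: it never interprets `location` itself."""
    return len(host.load_unit(location))
```

Because `unit_size` depends only on the abstract `Host`, the same library code works whether the bytes come from disk or from some other store.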
CCI provides a default host object that is sufficient for many applications. If your application requires non-standard support, such as handling files that are not stored on the hard drive or specifying a particular mscorlib.dll version, you must implement
a custom host. For more details, see
Applications typically use a single host object, but some cases require multiple hosts. For example, a host object can represent only one instance of mscorlib.dll. If you want to compare two mscorlib.dll versions, you need a separate host object for each DLL.
Comparing Strings and Types
CCI must often test strings or types for equality. CCI supports two objects, NameTable and InternFactory, which improve the performance of such tests.
The simplest way to test strings for equality, which is used by many .NET methods, is a character-by-character comparison. However, this approach is not very efficient. CCI Metadata improves the efficiency of string comparison by using a NameTable object. A NameTable is a container for a collection of key-value pairs. Each value is a string and the associated key is a unique integer. Once you have added the relevant strings to a NameTable, you can test for equality by comparing two integers, which is much faster than character-by-character comparison.
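The idea of trading string comparisons for integer comparisons can be illustrated with a minimal sketch. This is not the CCI NameTable implementation; the class below and its `get_key` method are hypothetical and only mirror the key-value scheme described above.

```python
class NameTable:
    """Hypothetical string-interning table: each distinct string gets one integer key."""

    def __init__(self) -> None:
        self._keys: dict[str, int] = {}

    def get_key(self, text: str) -> int:
        """Return the key for `text`, assigning the next free integer on first sight."""
        key = self._keys.get(text)
        if key is None:
            key = len(self._keys) + 1
            self._keys[text] = key
        return key


names = NameTable()
a = names.get_key("System.Object")
b = names.get_key("System.String")
c = names.get_key("System.Object")  # same string as `a`, so the same key

same = a == c       # integer comparison stands in for string comparison
different = a != b
```

The cost of looking at the characters is paid once, when a string is added; every later equality test is a single integer comparison.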
DefaultHost automatically creates a NameTable object and stores it in one of its properties, which is sufficient for most applications. However, some applications use multiple hosts. In that case, create your own NameTable object and pass it to the DefaultHost constructor each time you create a new host object. That ensures that every host uses the same key-value pairs.
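A small sketch shows why the table must be shared. The `Host` and `NameTable` classes here are hypothetical stand-ins, not the CCI types, and the constructor signature is an assumption made for illustration: each host creates its own table unless one is supplied.

```python
from typing import Optional


class NameTable:
    """Hypothetical interning table (string -> unique integer)."""

    def __init__(self) -> None:
        self._keys: dict[str, int] = {}

    def get_key(self, text: str) -> int:
        return self._keys.setdefault(text, len(self._keys) + 1)


class Host:
    """Hypothetical host that creates its own table unless one is supplied."""

    def __init__(self, name_table: Optional[NameTable] = None) -> None:
        self.name_table = name_table if name_table is not None else NameTable()


# Two hosts with private tables: the same string can receive different keys.
own_a, own_b = Host(), Host()
own_a.name_table.get_key("Foo")
own_b.name_table.get_key("Bar")  # own_b sees "Bar" first
mismatch = own_a.name_table.get_key("Foo") != own_b.name_table.get_key("Foo")

# Two hosts sharing one table: keys are comparable across hosts.
shared = NameTable()
host_a, host_b = Host(shared), Host(shared)
match = host_a.name_table.get_key("Foo") == host_b.name_table.get_key("Foo")
```

With private tables, an integer key is meaningful only within the host that issued it, so comparing keys across hosts would give wrong answers.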
Often, there is only one object per type, so you can compare types by comparing the object identities. However, applications can also have references to types, which are contained in reference objects. For example, you use a reference object to reference a
type in another assembly.
Two type references might not be for the same object, so you can’t simply resolve the references and compare object identity. To determine whether two objects are instances of the same type, you must compare the objects’ structure, which is relatively expensive.
CCI uses an InternFactory object to improve the performance of type comparisons. InternFactory works somewhat like NameTable. When CCI constructs a type, it uses an InternFactory to determine the type structure, and assigns a unique integer to the type. You can then use that integer for all subsequent comparisons, which is much faster than comparing structure.
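The same interning idea applied to types can be sketched as follows. This is an illustration under stated assumptions, not the CCI InternFactory: `TypeRef`, its fields, and `get_key` are hypothetical, and a real type structure carries far more detail than the four fields used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeRef:
    """Hypothetical type reference, identified by structure rather than object identity."""
    assembly: str
    namespace: str
    name: str
    generic_args: tuple["TypeRef", ...] = ()


class InternFactory:
    """Hypothetical factory: structurally equal types share one integer key."""

    def __init__(self) -> None:
        self._keys: dict[TypeRef, int] = {}

    def get_key(self, type_ref: TypeRef) -> int:
        # The structural comparison happens here, through hashing and equality.
        return self._keys.setdefault(type_ref, len(self._keys) + 1)


factory = InternFactory()
string_t = TypeRef("mscorlib", "System", "String")
list_1 = TypeRef("mscorlib", "System.Collections.Generic", "List", (string_t,))
list_2 = TypeRef("mscorlib", "System.Collections.Generic", "List",
                 (TypeRef("mscorlib", "System", "String"),))

# Distinct objects with the same structure receive the same key.
same_type = factory.get_key(list_1) == factory.get_key(list_2)
```

Once each reference has its key, later comparisons no longer need to walk the nested structure.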
Mutable and Immutable Representations
CCI Metadata provides two ways to represent a PE file, mutable and immutable.
The Mutable Representation
The mutable representation is an object model that represents the contents of a PE file. You can use the properties and methods exposed by mutable objects to modify the assembly. For example, a method’s metadata is represented by a MethodDefinition object, which applications can use to modify that metadata. The objects that support the mutable object model are in the Microsoft.Cci.MutableCodeModel namespace. The name of this namespace is a somewhat confusing historical artifact. It actually supports the CCI Metadata mutable representation. The objects that support CCI Code are in the
The Immutable Representation
The immutable representation is a set of interfaces, each of which provides read-only access to a corresponding mutable object’s properties. An immutable interface uses the mutable class name, prefaced by “I”. For example, the immutable interface for MethodDefinition is IMethodDefinition. The interfaces that support the immutable representation are in the Microsoft.Cci namespace. The immutable representation includes everything in a PE file except:
- Data that is the same for all PE files.
- Data that can be derived from other information in the file.
The objects that make up the mutable representation all have immutable interfaces, so the immutable representation is essentially a passive data structure that provides read-only access to the corresponding mutable representation.
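The relationship between a mutable object and its read-only interface can be sketched like this. The class names follow the example above, but the members (`name`, `is_static`) and the `describe` function are hypothetical, and the sketch does not reproduce the actual CCI types.

```python
from abc import ABC, abstractmethod


class IMethodDefinition(ABC):
    """Read-only view of a method; the members shown are hypothetical."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def is_static(self) -> bool: ...


class MethodDefinition(IMethodDefinition):
    """Mutable object that also satisfies the read-only interface."""

    def __init__(self, name: str, is_static: bool = False) -> None:
        self._name = name
        self._is_static = is_static

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value: str) -> None:
        self._name = value

    @property
    def is_static(self) -> bool:
        return self._is_static

    @is_static.setter
    def is_static(self, value: bool) -> None:
        self._is_static = value


def describe(method: IMethodDefinition) -> str:
    """A consumer written against the read-only interface only."""
    prefix = "static " if method.is_static else ""
    return f"{prefix}{method.name}"
```

A consumer such as `describe` sees only the getters, while the owner of the `MethodDefinition` keeps access to the setters.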
How to Create a Mutable Representation
Applications typically start with an immutable representation of a PE file. For example, CCI Metadata applications often use LoadUnitFrom to load an assembly from the hard drive. LoadUnitFrom returns an immutable representation of the assembly.
The immutable representation is sufficient for most analysis applications, which just need information from the assembly. Any application that modifies the assembly, such as rewriting applications, must work with a mutable representation. There are two approaches
to obtaining a mutable representation:
- Create a mutable copy of an immutable representation. CCI provides objects, called mutators, which traverse an immutable representation and produce a mutable copy.
- Create the assembly from scratch. The HelloIL Sample Walkthrough describes a simple example of how to use CCI Metadata to create an assembly.
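The first approach, copying an immutable representation into a mutable one, can be sketched in miniature. This is not how the CCI mutators are implemented; all class names and `make_mutable_copy` are hypothetical, and the object model is reduced to a type with a list of methods.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LoadedMethod:
    """Stand-in for an immutable method, as a loader might return it."""
    name: str


@dataclass(frozen=True)
class LoadedType:
    """Stand-in for an immutable type that owns immutable methods."""
    name: str
    methods: tuple[LoadedMethod, ...]


@dataclass
class MutableMethod:
    name: str


@dataclass
class MutableType:
    name: str
    methods: list[MutableMethod] = field(default_factory=list)


def make_mutable_copy(source: LoadedType) -> MutableType:
    """Walk the immutable tree and build an independent, editable copy."""
    return MutableType(
        name=source.name,
        methods=[MutableMethod(m.name) for m in source.methods],
    )


loaded = LoadedType("Program", (LoadedMethod("Main"),))
copy = make_mutable_copy(loaded)
copy.methods[0].name = "Run"                 # edits affect only the copy
copy.methods.append(MutableMethod("Helper"))
```

The original tree is left untouched, so any code that still holds the immutable representation continues to see consistent data.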
The Immutable Contract
When you pass all or part of an object model to a method, you always pass the immutable representation, even if you are working with a mutable representation. For example, CCI applications typically store an assembly by passing its immutable interface to the Microsoft.Cci.MutableCodeModel.PeWriter method, which converts the assembly's object model representation to the PE format and stores it as a PE file.
When you pass an immutable interface to a method, you implicitly accept a contract to make no further changes to the underlying object model. This contract allows methods to safely cache property values, multiple threads to safely access the object model without
obtaining locks, and so on. Follow these basic guidelines for using the mutable and immutable representations:
- If an object model is in flux, don’t pass it to anyone; keep the object model private until it is stable.
- Once you have passed an object model to someone, you’ve effectively given up ownership; do not make any further changes.
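A short sketch shows what can go wrong when the contract is broken. The classes are hypothetical and not part of CCI: `SignatureCache` plays the role of a method that caches a property value on the assumption that the object model will not change again.

```python
class Method:
    """Hypothetical mutable method object."""

    def __init__(self, name: str) -> None:
        self.name = name


class SignatureCache:
    """Hypothetical consumer that relies on the contract to cache results."""

    def __init__(self) -> None:
        self._cache: dict[int, str] = {}

    def display_name(self, method: Method) -> str:
        key = id(method)
        if key not in self._cache:
            # Computed once; later calls trust that the method has not changed.
            self._cache[key] = method.name.upper()
        return self._cache[key]


cache = SignatureCache()
method = Method("main")
first = cache.display_name(method)   # computed and cached

method.name = "run"                  # contract violation: changed after handing it over
stale = cache.display_name(method)   # the cache still returns the old value
```

The consumer is not at fault here: it was entitled to cache, and the stale result comes from the caller changing the object model after passing it on.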