# Metalava

(Also known as "doclava2", but deliberately not named doclava2 since crucially
it does not generate docs; it's intended only for **meta**data extraction and
generation.)

Metalava is a metadata generator intended for the Android source tree, used for
a number of purposes:

* Extracting the API into signature text files and into stub API files (which
  in turn get compiled into android.jar, the Android SDK library), and more
  importantly hiding code intended to be implementation only, driven by
  javadoc comments like @hide, @doconly, @removed, etc, as well as various
  annotations.

* Extracting source level annotations into an external annotations file (such
  as the typedef annotations, which cannot be stored in the SDK as .class level
  annotations).

* Diffing versions of the API and determining whether a newer version is
  compatible with the older version.

## Building and running

To download the code and any dependencies required for building, see [DOWNLOADING.md](DOWNLOADING.md).

To build:

    $ cd tools/metalava
    $ ./gradlew

This builds a binary distribution in `../../out/host/common/install/metalava/bin/metalava`.

To run metalava:

    $ ../../out/host/common/install/metalava/bin/metalava
                    _        _
     _ __ ___   ___| |_ __ _| | __ ___   ____ _
    | '_ ` _ \ / _ \ __/ _` | |/ _` \ \ / / _` |
    | | | | | |  __/ || (_| | | (_| |\ V / (_| |
    |_| |_| |_|\___|\__\__,_|_|\__,_| \_/ \__,_|

    metalava extracts metadata from source code to generate artifacts such as the
    signature files, the SDK stub files, external annotations etc.

    Usage: metalava <flags>

    Flags:

    --help                                This message.
    --quiet                               Only include vital output
    --verbose                             Include extra diagnostic output

    ...

(*output truncated*)

Metalava has a new command line syntax, but it also understands the doclava1
flags and translates them on the fly. Flags that are ignored are listed on the
command line. If metalava is dropped into an Android framework build for
example, you'll see something like this (unless running with --quiet):

    metalava: Ignoring javadoc-related doclava1 flag -J-Xmx1600m
    metalava: Ignoring javadoc-related doclava1 flag -J-XX:-OmitStackTraceInFastThrow
    metalava: Ignoring javadoc-related doclava1 flag -XDignore.symbol.file
    metalava: Ignoring javadoc-related doclava1 flag -doclet
    metalava: Ignoring javadoc-related doclava1 flag -docletpath
    metalava: Ignoring javadoc-related doclava1 flag -templatedir
    metalava: Ignoring javadoc-related doclava1 flag -htmldir
    ...

## Features

* Compatibility with doclava1: in compat mode, metalava spits out the same
  signature files for the framework as doclava1.

* Ability to read in an existing android.jar file instead of from source, which
  means we can regenerate signature files etc for older versions according to
  new formats (e.g. to fix past errors in doclava, such as annotation instance
  methods which were accidentally not included.)

* Ability to merge in data (annotations etc) from external sources, such as
  IntelliJ external annotations data as well as signature files containing
  annotations. This isn't just merged at export time, it's merged at codebase
  load time such that it can be part of the API analysis.

* Support for an updated signature file format (which is described in
  FORMAT.md).

  * Address errors in the doclava1 format which for example was missing
    annotation class instance methods

  * Improve the signature format such that it for example labels enums "enum"
    instead of "abstract class extends java.lang.Enum", annotations as
    "@interface" instead of "abstract class extends java.lang.Annotation", sorts
    modifiers in the canonical modifier order, uses "extends" instead of
    "implements" for the superclass of an interface, and many other similar
    tweaks outlined in the `Compatibility` class. (Metalava also allows (and
    ignores) block comments in the signature files.)

  * Add support for writing (and reading) annotations into the signature
    files. This is vital now that some of these annotations become part of the
    API contract (in particular nullness contracts, as well as parameter names
    and default values.)

  * Support for a "compact" nullness format -- one based on Kotlin's
    syntax. Since the goal is to have **all** API elements explicitly state
    their nullness contract, the signature files would very quickly become
    bloated with @NonNull and @Nullable annotations everywhere. So instead, the
    signature format now uses a suffix of `?` for nullable, `!` for not yet
    annotated, and nothing for non-null.

    Instead of

        method public java.lang.Double convert0(java.lang.Float);
        method @Nullable public java.lang.Double convert1(@NonNull java.lang.Float);

    we have

        method public java.lang.Double! convert0(java.lang.Float!);
        method public java.lang.Double? convert1(java.lang.Float);

  * Other compactness improvements: Skip packages in some cases on export and
    reinsert them on import. Specifically, drop "java.lang." from type names
    such that you have

        method public void onUpdate(int, String);

    instead of

        method public void onUpdate(int, java.lang.String);

    Similarly, annotations (the ones considered part of the API; unknown
    annotations are not included in signature files) use just the simple name
    instead of the full package name, e.g. `@UiThread` instead of
    `@android.annotation.UiThread`.

  * Misc documentation handling; for example, it attempts to fix sentences that
    javadoc will mistreat, such as sentences that "end" with "e.g. ".  It also
    looks for various common typos and fixes those; here's a sample of the
    error messages produced when running metalava on master (Enhancing docs):

        frameworks/base/core/java/android/content/res/AssetManager.java:166: error: Replaced Kitkat with KitKat in documentation for Method android.content.res.AssetManager.getLocales() [Typo]
        frameworks/base/core/java/android/print/PrinterCapabilitiesInfo.java:122: error: Replaced Kitkat with KitKat in documentation for Method android.print.PrinterCapabilitiesInfo.Builder.setColorModes(int, int) [Typo]

* Built-in support for injecting new annotations for use by the Kotlin compiler,
  not just nullness annotations found in the source code and annotations merged
  in from external sources, but also inferring whether nullness annotations have
  recently changed and if so marking them as @Migrate (which lets the Kotlin
  compiler treat errors in the user code as warnings instead of errors.)

* Support for generating documentation into the stubs files (so we can run
  javadoc or [Dokka](https://github.com/Kotlin/dokka) on the stubs files instead
  of the source code). This means that the documentation tool itself does not
  need to be able to figure out which parts of the source code are included in
  the API and which are implementation; it is simply handed the filtered API
  stub sources that include documentation.

* Support for parsing Kotlin files. API files can now be implemented in Kotlin
  as well and metalava will parse and extract API information from them just as
  is done for Java files.

* Like doclava1, metalava can diff two APIs and warn about API compatibility
  problems such as removing API elements. Metalava adds new warnings around
  nullness, such as attempting to change a nullness contract incompatibly
  (e.g. you can change a parameter from non null to nullable for final classes,
  but not vice versa).  It also lets you diff directly on a source tree; it
  does not require you to create two signature files to diff.

* Consistent stubs: In doclava1, the code which iterated over the API and
  generated the signature files and generated the stubs had diverged, so there
  was some inconsistency. In metalava the stub files contain **exactly** the
  same signatures as in the signature files.

  (This turned out to be incredibly important; this revealed for example that
  StringBuilder.setLength(int) was missing from the API signatures since it is a
  public method inherited from a package private superclass, which the API
  extraction code in doclava1 missed, but accidentally included in the SDK
  anyway since it packages package private classes. Metalava strictly applies
  the exact same API as is listed in the signature files, and once this was
  hooked up to the build it immediately became apparent that it was missing
  important methods that should really be part of the API.)

* API Lint: Metalava can optionally (with --api-lint) run a series of additional
  checks on the public API in the codebase and flag issues that are discouraged
  or forbidden by the Android API Council; there are currently around 80 checks.
  Some of these take advantage of looking at the source code which wasn't
  possible with the signature-file based Python version; for example, it looks
  inside method bodies to see if you're synchronizing on this or the current
  class, which is forbidden.

* Baselines: Metalava can report all of its issues into a "baseline" file, which
  records the current set of issues. From that point forward, when metalava
  finds a problem, it will only be reported if it is not already in the
  baseline.  This lets you enforce new issues going forward without having to
  fix all existing violations. Periodically, as older issues are fixed, you can
  regenerate the baseline. For issues with some false positives, such as API
  Lint, being able to check in the set of accepted or verified false positives
  is quite important.

* Metalava can generate reports about nullness annotation coverage (which helps
  target efforts since we plan to annotate the entire API). First, it can
  generate a raw count:

        Nullness Annotation Coverage Statistics:
        1279 out of 46900 methods were annotated (2%)
        2 out of 21683 fields were annotated (0%)
        2770 out of 47492 parameters were annotated (5%)

  More importantly, you can also point it to some existing compiled applications
  (.class or .jar files) and it will then measure the annotation coverage of the
  APIs used by those applications. This lets us focus our annotation efforts on
  the most important APIs, the ones currently used by a corpus of apps. For
  example, run the analysis on the current version of the framework and point
  it to the [Plaid](https://github.com/nickbutcher/plaid) app's compiled output
  with

      ... --annotation-coverage-of ~/plaid/app/build/intermediates/classes/debug

  This produces the following output:

    324 methods and fields were missing nullness annotations out of 650 total
    API references.  API nullness coverage is 50%

    ```
    | Qualified Class Name                                         |      Usage Count |
    |--------------------------------------------------------------|-----------------:|
    | android.os.Parcel                                            |              146 |
    | android.view.View                                            |              119 |
    | android.view.ViewPropertyAnimator                            |              114 |
    | android.content.Intent                                       |              104 |
    | android.graphics.Rect                                        |               79 |
    | android.content.Context                                      |               61 |
    | android.widget.TextView                                      |               53 |
    | android.transition.TransitionValues                          |               49 |
    | android.animation.Animator                                   |               34 |
    | android.app.ActivityOptions                                  |               34 |
    | android.view.LayoutInflater                                  |               31 |
    | android.app.Activity                                         |               28 |
    | android.content.SharedPreferences                            |               26 |
    | android.content.SharedPreferences.Editor                     |               26 |
    | android.text.SpannableStringBuilder                          |               23 |
    | android.view.ViewGroup.MarginLayoutParams                    |               21 |
    | ... (99 more items                                           |                  |
    ```

  Top referenced un-annotated members:

    ```
    | Member                                                       |      Usage Count |
    |--------------------------------------------------------------|-----------------:|
    | Parcel.readString()                                          |               62 |
    | Parcel.writeString(String)                                   |               62 |
    | TextView.setText(CharSequence)                               |               34 |
    | TransitionValues.values                                      |               28 |
    | View.getContext()                                            |               28 |
    | ViewPropertyAnimator.setDuration(long)                       |               26 |
    | ViewPropertyAnimator.setInterpolator(android.animation.Ti... |               26 |
    | LayoutInflater.inflate(int, android.view.ViewGroup, boole... |               23 |
    | Rect.left                                                    |               22 |
    | Rect.top                                                     |               22 |
    | Intent.Intent(android.content.Context, Class<?>)             |               21 |
    | Rect.bottom                                                  |               21 |
    | TransitionValues.view                                        |               21 |
    | VERSION.SDK_INT                                              |               18 |
    | Context.getResources()                                       |               18 |
    | EditText.getText()                                           |               18 |
    | ... (309 more items                                          |                  |
    ```

  From this it's clear that it would be useful to start annotating
  android.os.Parcel and android.view.View for example where there are
  unannotated APIs that are frequently used, at least by this app.

* Built on top of a full, type-resolved AST. Doclava1 was integrated with
  javadoc, which meant that most of the source tree was opaque. Therefore, as
  just one example, the code which generated documentation for typedef constants
  had to require the constants to all share a single prefix it could look
  for. However, in metalava, annotation references are available at the AST
  level, so it can resolve references and map them back to the original field
  references and include those directly.

* Support for extracting annotations. Metalava can also generate the external
  annotation files needed by Studio and lint in Gradle, which capture the
  typedefs (@IntDef and @StringDef classes) in the source code. Previously,
  these were generated manually via the development/tools/extract code. This
  also merges in manually curated data; some of this is in the manual/ folder
  in this project.

* Support for extracting API levels (api-versions.xml). This used to be
  generated by separate code (tools/base/misc/api-generator), invoked during
  the build. This functionality is now rolled into metalava, which has one very
  important attribute: metalava will use this information when recording API
  levels for API usage. (Prior to this, this was based on signature file
  parsing in doclava, which sometimes generated incorrect results. Metalava
  uses the android.jar files themselves to ensure that it computes the exact
  available SDK data for each API level.)

* Misc other features. For example, if you use the @VisibleForTesting annotation
  from the support library, where you can express the intended visibility if the
  method had not required visibility for testing, then metalava will treat that
  method using the intended visibility instead when generating signature files
  and stubs.
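
The compact nullness suffixes described in the signature format item above can be sketched roughly as follows. This is only an illustration: the `Nullness` enum and the `compactType` function are hypothetical names, not part of metalava's actual model, and the sketch ignores details such as primitive types.

```kotlin
// Hypothetical stand-in for a type's nullness state; not metalava's real model.
enum class Nullness { NON_NULL, NULLABLE, UNKNOWN }

// Renders a type using the compact suffixes: `?` for nullable,
// `!` for not yet annotated, and nothing for non-null.
fun compactType(typeName: String, nullness: Nullness): String {
    val suffix = when (nullness) {
        Nullness.NON_NULL -> ""
        Nullness.NULLABLE -> "?"
        Nullness.UNKNOWN -> "!"
    }
    return typeName + suffix
}
```

With these assumptions, an unannotated `java.lang.Double` renders as `java.lang.Double!`, matching the `convert0` line in the example above.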

## Architecture & Implementation

Metalava is implemented on top of IntelliJ parsing APIs (PSI and UAST). However,
these are hidden behind a "model": an abstraction layer which only exposes high
level concepts like packages, classes and inner classes, methods, fields, and
modifier lists (including annotations).

This is done for multiple reasons:

(1) It allows us to have multiple "back-ends": for example, metalava can read in
    a model not just from parsing source code, but from reading older SDK
    android.jar files (e.g. backed by bytecode) or reading previous signature
    files.  Reading in multiple versions of an API lets metalava perform
    "diffing", such as warning if an API is changing in an incompatible way. It
    can also generate signature files in the new format (including data that was
    missing in older signature files, such as annotation methods) without having
    to parse older source code which may no longer be easy to parse.

(2) There's a lot of logic for deciding whether code found in the source tree
    should be included in the API. With the model approach we can build up an
    API and for example mark a subset of its methods as included. By having a
    separate hierarchy we can easily perform this work once and pass around our
    filtered model instead of passing around PsiClass and PsiMethod instances
    and having to keep the filtered data separately and remembering to always
    consult the filter, not the PSI elements directly.

The basic API element class is "Item". (In doclava1 this was called a
"DocInfo".)  There are several sub interfaces of Item: PackageItem, ClassItem,
MemberItem, MethodItem, FieldItem, ParameterItem, etc. And then there are
several implementation hierarchies: One is PSI based, where you point metalava
to a source tree or a .jar file, and it constructs Items built on top of PSI:
PsiPackageItem, PsiClassItem, PsiMethodItem, etc. Another is textual, based on
signature files: TextPackageItem, TextClassItem, and so on.
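
To make the idea of one model with several back-ends concrete, here is a heavily simplified sketch. The interface names echo the ones above, but their members, the `Source*` classes, and the `describe` function are assumptions made for illustration; the real interfaces are much richer.

```kotlin
// Simplified, hypothetical stand-ins for the model interfaces.
interface Item { val name: String }
interface MethodItem : Item
interface ClassItem : Item { val methods: List<MethodItem> }

// A "text" back-end, as if the class had been read from a signature file.
data class TextMethodItem(override val name: String) : MethodItem
data class TextClassItem(
    override val name: String,
    override val methods: List<MethodItem>
) : ClassItem

// A second back-end standing in for items built from parsed source.
data class SourceMethodItem(override val name: String, val sourceLine: Int) : MethodItem
data class SourceClassItem(
    override val name: String,
    override val methods: List<MethodItem>
) : ClassItem

// Written against the model only, so it works with either back-end.
fun describe(cls: ClassItem): String =
    cls.name + cls.methods.joinToString(prefix = " { ", postfix = " }") { it.name }
```

Code such as `describe` never needs to know whether a `ClassItem` came from source, a .jar file, or a signature file, which is the point of hiding PSI behind the model.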

The "Codebase" class captures a complete API snapshot (including classes that
are hidden, which is why it's called a "Codebase" rather than an "API").

There are methods to load codebases - from source folders, from a .jar file,
from a signature file. That's how API diffing is performed: you load two
codebases (from whatever source you want, typically a previous API signature
file and the current set of source folders), and then you "diff" the two.

There are several key helpers that help with the implementation, detailed next.

### Visiting Items

First, metalava provides an ItemVisitor. This lets you visit the API easily.
For example, here's how you can visit every class:

    codebase.accept(object : ItemVisitor() {
        override fun visitClass(cls: ClassItem) {
            // code operating on the class here
        }
    })

Similarly, you can visit all items (regardless of type) by overriding
`visitItem`, or visit specific kinds of items by overriding `visitPackage`,
`visitClass`, `visitMethod`, etc.
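
The visitor pattern above can be tried out in isolation with a toy model. Everything in this sketch (the two-level `ClassItem`/`MethodItem` model, the `Codebase.accept` traversal order, and `methodNames`) is a simplified assumption and not metalava's actual implementation.

```kotlin
// Hypothetical, pared-down model: a codebase is a list of classes with methods.
class MethodItem(val name: String)
class ClassItem(val name: String, val methods: List<MethodItem>)

open class ItemVisitor {
    open fun visitClass(cls: ClassItem) {}
    open fun visitMethod(method: MethodItem) {}
}

class Codebase(private val classes: List<ClassItem>) {
    // Walks every class, then each of its methods, calling back into the visitor.
    fun accept(visitor: ItemVisitor) {
        for (cls in classes) {
            visitor.visitClass(cls)
            cls.methods.forEach { visitor.visitMethod(it) }
        }
    }
}

// Collects all method names by overriding only the callback it needs.
fun methodNames(codebase: Codebase): List<String> {
    val names = mutableListOf<String>()
    codebase.accept(object : ItemVisitor() {
        override fun visitMethod(method: MethodItem) {
            names += method.name
        }
    })
    return names
}
```

Because every callback has an empty default body, a visitor only overrides the item kinds it cares about.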

There is also an `ApiVisitor`. This is a subclass of the `ItemVisitor`, but
which limits itself to visiting code elements that are part of the API.

This is how for example the SignatureWriter and the StubWriter are both
implemented: they simply extend `ApiVisitor`, which means they'll only export
the API items in the codebase, and then in each relevant method they emit the
signature or stub data:

    class SignatureWriter(
            private val writer: PrintWriter,
            private val generateDefaultConstructors: Boolean,
            private val filter: (Item) -> Boolean) : ApiVisitor(
            visitConstructorsAsMethods = false) {

    ....

    override fun visitConstructor(constructor: ConstructorItem) {
        writer.print("    ctor ")
        writeModifiers(constructor)
        writer.print(constructor.containingClass().fullName())
        writeParameterList(constructor)
        writeThrowsList(constructor)
        writer.print(";\n")
    }

    ....
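
A minimal sketch of the relationship between the two visitors is shown below. The `hidden` flag, the `include` hook, and the `SignatureLines` class are hypothetical; they only illustrate how a filtering visitor lets a writer emit API items without ever seeing hidden code.

```kotlin
// Hypothetical minimal model; `hidden` stands in for the effect of @hide and similar markers.
class MethodItem(val name: String, val hidden: Boolean = false)

open class ItemVisitor {
    open fun visitMethod(method: MethodItem) {}

    // Decides whether an item is visited at all; the base visitor sees everything.
    open fun include(method: MethodItem): Boolean = true

    fun visitAll(methods: List<MethodItem>) {
        methods.filter { include(it) }.forEach { visitMethod(it) }
    }
}

// Restricts visiting to API items, so subclasses never see hidden code.
open class ApiVisitor : ItemVisitor() {
    override fun include(method: MethodItem): Boolean = !method.hidden
}

// A writer that records one signature-style line per visited method.
class SignatureLines : ApiVisitor() {
    val lines = mutableListOf<String>()

    override fun visitMethod(method: MethodItem) {
        lines += "method public void ${method.name}();"
    }
}
```

In this sketch the writer contains no filtering logic of its own; extending `ApiVisitor` is enough to keep hidden methods out of its output.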

### Visiting Types

There is a `TypeVisitor` similar to `ItemVisitor` which you can use to visit all
types in the codebase.

When computing the API, all types that are included in the API should be
included (e.g. if `List<Foo>` is part of the API then `Foo` must be too).  This
is easy to do with the `TypeVisitor`.
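
As a rough illustration of the `List<Foo>` example, the sketch below collects every type reachable through type arguments. The `TypeItem` data class, the `visit` traversal, and `referencedTypes` are assumptions made for this example only.

```kotlin
// Hypothetical type model: a type has a name and zero or more type arguments.
data class TypeItem(val name: String, val arguments: List<TypeItem> = emptyList())

open class TypeVisitor {
    open fun visitType(type: TypeItem) {}

    // Visits the type itself, then recurses into its type arguments.
    fun visit(type: TypeItem) {
        visitType(type)
        type.arguments.forEach { visit(it) }
    }
}

// Gathers every type name reachable from the given API types.
fun referencedTypes(apiTypes: List<TypeItem>): Set<String> {
    val seen = linkedSetOf<String>()
    val collector = object : TypeVisitor() {
        override fun visitType(type: TypeItem) {
            seen += type.name
        }
    }
    apiTypes.forEach { collector.visit(it) }
    return seen
}
```

Starting from `List<Foo>`, the collector reports both `List` and `Foo`, which is the set that must then be part of the API.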

### Diffing Codebases

Another visitor which helps with implementation is the ComparisonVisitor:

    open class ComparisonVisitor {
        open fun compare(old: Item, new: Item) {}
        open fun added(item: Item) {}
        open fun removed(item: Item) {}

        open fun compare(old: PackageItem, new: PackageItem) { }
        open fun compare(old: ClassItem, new: ClassItem) { }
        open fun compare(old: MethodItem, new: MethodItem) { }
        open fun compare(old: FieldItem, new: FieldItem) { }
        open fun compare(old: ParameterItem, new: ParameterItem) { }

        open fun added(item: PackageItem) { }
        open fun added(item: ClassItem) { }
        open fun added(item: MethodItem) { }
        open fun added(item: FieldItem) { }
        open fun added(item: ParameterItem) { }

        open fun removed(item: PackageItem) { }
        open fun removed(item: ClassItem) { }
        open fun removed(item: MethodItem) { }
        open fun removed(item: FieldItem) { }
        open fun removed(item: ParameterItem) { }
    }

This makes it easy to perform API comparison operations.

For example, metalava has a feature to mark "newly annotated" nullness
annotations as migrated. To do this, it just extends `ComparisonVisitor`,
overrides the `compare(old: Item, new: Item)` method, and checks whether the old
item has no nullness annotations and the new one does, and if so, also marks the
new annotations as @Migrate.
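
The migration check just described might look roughly like this in miniature. The `Item` class with a nullable `nullness` string, the `migrated` flag, and the name-based pairing in `compareAll` are all hypothetical simplifications, not the real metalava types or matching logic.

```kotlin
// Hypothetical stand-in for an API item; `nullness` is null when unannotated.
class Item(val name: String, val nullness: String?) {
    var migrated = false
}

open class ComparisonVisitor {
    open fun compare(old: Item, new: Item) {}
}

// Marks items that gained a nullness annotation since the old codebase.
class MigrationMarker : ComparisonVisitor() {
    override fun compare(old: Item, new: Item) {
        if (old.nullness == null && new.nullness != null) {
            new.migrated = true
        }
    }
}

// Pairs up items by name and hands each matching pair to the visitor.
fun compareAll(old: List<Item>, new: List<Item>, visitor: ComparisonVisitor) {
    val oldByName = old.associateBy { it.name }
    for (item in new) {
        oldByName[item.name]?.let { visitor.compare(it, item) }
    }
}
```

Only items that were unannotated before and are annotated now get flagged; items whose annotation is unchanged are left alone.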

Similarly, the API Check can simply override

    override fun removed(item: Item) {
        reporter.report(error, item, "Removing ${Item.describe(item)} is not allowed")
    }

to flag all API elements that have been removed as invalid, since you cannot
remove API. (The real check is slightly more complicated; it looks into the
hierarchy to see if there still is an inherited method with the same signature,
in which case the deletion is allowed.)
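
The hierarchy lookup mentioned in the parenthetical can be sketched as follows. This is a guess at the shape of the logic under simplifying assumptions: methods are plain signature strings, each class has at most one superclass, and `findMethod` and `checkRemovals` are hypothetical names.

```kotlin
// Hypothetical model: a class knows its own method signatures and its superclass.
class ClassItem(
    val name: String,
    val methods: Set<String>,
    val superClass: ClassItem? = null
) {
    // True if this class or any ancestor declares the signature.
    fun findMethod(signature: String): Boolean =
        signature in methods || (superClass?.findMethod(signature) ?: false)
}

// Returns error messages for methods that disappeared from the class and
// are not still available through inheritance.
fun checkRemovals(old: ClassItem, new: ClassItem): List<String> =
    old.methods
        .filter { !new.findMethod(it) }
        .map { "Removing method ${old.name}.$it is not allowed" }
```

Here a method that moves up into a superclass is not reported, because callers can still reach it with the same signature.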

### Documentation Generation

As mentioned above, metalava generates documentation directly into the stubs
files, which can then be processed by Dokka and Javadoc to generate the same
docs as before.

Doclava1 was integrated with javadoc directly, so the way it generated metadata
docs (such as documenting permissions, ranges and typedefs from annotations) was
to insert auxiliary tags (`@range`, `@permission`, etc) and then this would get
converted into English docs later via `macros_override.cs`.

This is not how metalava does it; it generates the English documentation
directly. This was not just convenient for the implementation (since metalava
does not use javadoc data structures to pass maps like the arguments for the
typedef macro), but should also help Dokka -- and arguably the Kotlin code which
generates the documentation is easier to reason about and to update when it's
handling loop conditionals. (As a result I for example improved some of the
grammar, e.g. when it's listing a number of possible constants the conjunction
is usually "or", but if it's a flag, the sentence begins with "a combination
of" and then the conjunction at the end should be "and".)
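
The conjunction rule in that last parenthetical is easy to express directly in code. The function below is only a sketch: the name `describeConstants`, the comma placement, and the exact wording are assumptions, not the text metalava actually emits.

```kotlin
// Joins constant names into an English list, choosing the conjunction by kind:
// flags may be combined ("and"), plain constants are alternatives ("or").
fun describeConstants(names: List<String>, isFlag: Boolean): String {
    val conjunction = if (isFlag) "and" else "or"
    val list = when (names.size) {
        0 -> ""
        1 -> names[0]
        2 -> "${names[0]} $conjunction ${names[1]}"
        else -> names.dropLast(1).joinToString(", ") + ", $conjunction ${names.last()}"
    }
    return if (isFlag) "a combination of $list" else list
}
```

Generating the sentence in ordinary code like this keeps the special cases (one constant, two constants, flags) in one place instead of spreading them across template macros.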