chore: improve providence docs (#2408)

* chore: improve providence docs

* Apply suggestions from code review
Thijs Louisse 2024-11-12 17:10:06 +01:00 committed by GitHub
parent faca1916fc
commit a19096e703
GPG key ID: B5690EEEBB952194
11 changed files with 48 additions and 1636 deletions

@ -0,0 +1,5 @@
---
'providence-analytics': patch
---
improve docs

@ -1,68 +0,0 @@
# Node Tools >> Providence Analytics >> Local configuration ||40
The Providence configuration file is read by the Providence CLI (optional) and by the dashboard (required).
It has a few requirements:
- it must be called `providence.conf.js` or `providence.conf.mjs`
- it must be in ESM format
- it must be located in the root of a repository (under `process.cwd()`)
## Meta data
### Category info
Based on the filePath of a result, a category can be added.
For example:
```js
export default {
  metaConfig: {
    categoryConfig: [
      {
        // This is the name found in package.json
        project: '@lion/ui/root.js',
        // These conditions will be run on every filePath
        categories: {
          core: p => p.startsWith('./packages/core'),
          utils: p => p.startsWith('./packages/ajax') || p.startsWith('./packages/localize'),
          overlays: p =>
            p.startsWith('./packages/overlays') ||
            p.startsWith('./packages/dialog') ||
            p.startsWith('./packages/tooltip'),
          ...
        },
      },
    ],
  },
}
```
> N.B. category info is regarded as subjective, therefore it's advised to move this away from
> Analyzers (and thus file-system cache). Categories can be added realtime in the dashboard.
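As a minimal sketch of how such category predicates could be applied to a file path: the helper `categorize` below is hypothetical (it is not part of Providence) and only assumes that each category is a function from a path to a boolean, as in the configuration above.

```js
// Hypothetical helper (not part of Providence): returns the names of all
// categories whose predicate accepts the given file path.
function categorize(filePath, categories) {
  return Object.entries(categories)
    .filter(([, matches]) => matches(filePath))
    .map(([name]) => name);
}

// Two of the predicates from the configuration example
const categories = {
  core: p => p.startsWith('./packages/core'),
  utils: p => p.startsWith('./packages/ajax') || p.startsWith('./packages/localize'),
};

console.log(categorize('./packages/ajax/src/Ajax.js', categories));
```

Returning an array rather than a single name keeps the sketch honest about overlapping predicates: a path can satisfy more than one condition.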
## Project paths
### referenceCollections
A list of file system paths. They can be defined relative to the current project root or as full paths.
When a [MatchAnalyzer](../../../fundamentals/node-tools/providence-analytics/analyzer.md) like `match-imports` or `match-subclasses` is used, the default reference(s) can be configured here. For instance: `['/path/to/@lion/ui/form.js']`
An example:
```js
referenceCollections: {
  // Our products
  'lion-based-ui': [
    './providence-input-data/references/lion-based-ui',
    './providence-input-data/references/lion-based-ui-labs',
  ],
  ...
}
```
### searchTargetCollections
A list of file system paths. They can be defined relative to the current project root
or as full paths.
When not defined, the current project will be the search target (this is most common when
providence is used as a dev dependency).

@ -1,111 +0,0 @@
# Node Tools >> Providence Analytics >> QueryResult ||50
When an Analyzer has run, it returns a QueryResult. This is a json object that contains all
meta info (mainly configuration parameters) and the query output.
A QueryResult always contains the analysis of one project (a target project). Optionally,
it can contain a reference project as well.
## Anatomy
A QueryResult starts with a meta section, followed by the actual results.
### Meta
The meta section lists all configuration options the analyzer was run with. Here, you see an
example of a `find-imports` QueryResult:
```js
"meta": {
  "searchType": "ast-analyzer",
  "analyzerMeta": {
    "name": "find-imports",
    "requiredAst": "babel",
    "identifier": "importing-target-project_0.0.2-target-mock__1970011674",
    "targetProject": {
      "name": "importing-target-project",
      "commitHash": "3e5014d6ecdff1fc71138cdb29aaf7bf367588f5",
      "version": "0.0.2-target-mock"
    },
    "configuration": {
      "keepInternalSources": false
    }
  }
},
```
### Output
The output is usually more specifically tied to the Analyzer. What most regular Analyzers
(not being MatchAnalyzers that require a referenceProjectPath) have in common, is that their
results are being shown per "entry" (an entry corresponds with an AST generated by Babel, which in
turn corresponds to a file found in a target or reference project).
Below an example is shown of `find-imports` QueryOutput:
```js
"queryOutput": [
  {
    "project": {
      "name": "importing-target-project",
      "mainEntry": "./target-src/match-imports/root-level-imports.js",
      "version": "0.0.2-target-mock",
      "commitHash": "3e5014d6ecdff1fc71138cdb29aaf7bf367588f5"
    },
    "entries": [
      {
        "file": "./target-src/find-imports/all-notations.js",
        "result": [
          {
            "importSpecifiers": [
              "[file]"
            ],
            "source": "imported/source",
            "normalizedSource": "imported/source",
            "fullSource": "imported/source"
          },
          {
            "importSpecifiers": [
              "[default]"
            ],
            "source": "imported/source-a",
            "normalizedSource": "imported/source-a",
            "fullSource": "imported/source-a"
          },
          ...
```
MatchAnalyzers usually do post processing on the entries. The output below (for the `match-imports`
Analyzer) shows an ordering by matched specifier.
```js
"queryOutput": [
  {
    "exportSpecifier": {
      "name": "[default]",
      "project": "exporting-ref-project",
      "filePath": "./index.js",
      "id": "[default]::./index.js::exporting-ref-project"
    },
    "matchesPerProject": [
      {
        "project": "importing-target-project",
        "files": [
          "./target-src/match-imports/root-level-imports.js",
          "./target-src/match-subclasses/internalProxy.js"
        ]
      }
    ]
  },
  ...
```
Due to some legacy decisions, the QueryOutput allows for multiple target- and reference projects.
Aggregation of data now takes place in the dashboard.
QueryOutputs always contain one or a combination of two projects. This means that the
QueryOutput structure could be simplified in the future.
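To make the `match-imports` structure concrete, here is a small sketch that counts the matched files per export specifier. The helper `countMatches` and the abbreviated sample data are assumptions made for illustration; only the field names (`exportSpecifier`, `matchesPerProject`, `files`) come from the example output.

```js
// Sample shaped like the `match-imports` output (file paths abbreviated)
const queryOutput = [
  {
    exportSpecifier: {
      name: '[default]',
      project: 'exporting-ref-project',
      filePath: './index.js',
      id: '[default]::./index.js::exporting-ref-project',
    },
    matchesPerProject: [
      { project: 'importing-target-project', files: ['./a.js', './b.js'] },
    ],
  },
];

// Hypothetical helper: total number of matched files per export specifier id
function countMatches(output) {
  return output.map(({ exportSpecifier, matchesPerProject }) => ({
    id: exportSpecifier.id,
    fileCount: matchesPerProject.reduce((sum, { files }) => sum + files.length, 0),
  }));
}

console.log(countMatches(queryOutput));
```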
## Environment agnosticism
The output files stored in the file system always need to be machine independent:
this means that all machine specific information, like a complete filepath, needs to be removed from a QueryOutput (paths relative from project root are still allowed).
In that way, the caching mechanism (based on hash comparisons) as described in [Analyzer](../../../fundamentals/node-tools/providence-analytics/analyzer.md) is guaranteed to work across different machines.
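A minimal sketch of the idea, assuming POSIX style paths: the helper `toProjectRelative` is hypothetical and only illustrates how a machine specific prefix could be stripped so that the stored path is relative to the project root.

```js
// Hypothetical helper (not Providence's implementation): strips a machine
// specific project root from a POSIX style absolute path.
function toProjectRelative(absolutePath, projectRoot) {
  const root = projectRoot.endsWith('/') ? projectRoot : `${projectRoot}/`;
  if (!absolutePath.startsWith(root)) {
    throw new Error(`${absolutePath} is not inside ${projectRoot}`);
  }
  return `./${absolutePath.slice(root.length)}`;
}

console.log(toProjectRelative('/home/me/repo/src/index.js', '/home/me/repo'));
```

Because the result no longer contains `/home/me`, the same output (and thus the same hash) can be produced on any machine that checks out the project.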

@ -1,69 +0,0 @@
# Node Tools >> Providence Analytics >> Analyzer ||20
Analyzers form the core of Providence. They contain predefined queries based on AST traversal/analysis.
A few examples are:
- find-imports
- find-exports
- match-imports
An analyzer will give back a [QueryResult](../../../fundamentals/node-tools/providence-analytics/QueryResult.md) that will be written to the file system by Providence.
All analyzers need to extend from the `Analyzer` base class, found in `src/program/analyzers/helpers`.
## Public api
Providence has the following configuration api:
- name (string)
- requiresReference (boolean)
An analyzer will always need a targetProjectPath and can optionally have a referenceProjectPath.
In the latter case, it needs to have `requiresReference: true` configured.
During AST traversal, the following api can be consulted:
- `.targetData`
- `.referenceData`
- `.identifier`
## Phases
### Prepare phase
In this phase, all preparations will be done to run the analysis. Providence is designed to be performant and therefore will first look if it finds an already existing, cached result for the current setup.
### Traverse phase
The ASTs are created for all projects involved and the data are extracted into a QueryOutput. This output can optionally be post processed.
### Finalize phase
The data are normalized and written to the filesystem in JSON format.
## Targets and references
Every Analyzer needs a targetProjectPath. A targetProjectPath is a file path String that.
## Types
We can roughly distinguish two types of analyzers: those that require a reference and those that don't require a reference.
## Database
In order to share data across multiple machines, results are written to the filesystem in a
"machine agnostic" way. They can be shared through git and serve as a local database.
### Caching
In order to make caching possible, Providence creates an "identifier": a hash from the combination of project versions + Analyzer configuration. When an identifier already exists in the filesystem, the result can be read from cache. This increases performance and helps mitigate memory problems that can occur when handling large amounts of data in a batch.
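The caching idea can be sketched as follows. This is an illustration under stated assumptions, not Providence's actual implementation: `simpleHash` is a non-cryptographic stand-in for whatever hash Providence uses, `createIdentifier` and `runWithCache` are hypothetical names, and an in-memory `Map` stands in for the result files on the file system.

```js
// Simple non-cryptographic string hash (a stand-in, not Providence's hash)
function simpleHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i += 1) {
    hash = (hash * 31 + str.charCodeAt(i)) | 0; // stay within 32-bit integers
  }
  return Math.abs(hash);
}

// Hypothetical: derives an identifier from project name, version and analyzer config
function createIdentifier({ name, version }, analyzerConfig) {
  // Sort top-level keys so that equal configurations serialize identically
  const sortedConfig = Object.fromEntries(
    Object.entries(analyzerConfig).sort(([a], [b]) => a.localeCompare(b)),
  );
  return `${name}_${version}__${simpleHash(JSON.stringify(sortedConfig))}`;
}

const cache = new Map(); // stands in for result files on the file system

// Runs `analyze` only when no result exists yet for this identifier
function runWithCache(project, analyzerConfig, analyze) {
  const identifier = createIdentifier(project, analyzerConfig);
  if (!cache.has(identifier)) {
    cache.set(identifier, analyze());
  }
  return cache.get(identifier);
}
```

A second call with the same project version and configuration returns the stored result without invoking `analyze` again, which is the property that lets a batch run skip work that was already done.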
## Analyzer helpers
Inside the folder './src/program/analyzers', a folder 'helpers' is found.
Helpers are created specifically for use within analyzers and have knowledge about
the context of the analyzer (knowledge about an AST and/or QueryResult structure).
Generic functionality (that can be applied in any context) can be found in './src/program/utils'.
## Post processors
Post processors are imported by analyzers and act on their outputs. They can be enabled via the configuration of an analyzer and can be found in './src/program/analyzers/post-processors'. For instance: transform the output of analyzer 'find-imports' by sorting on specifier instead of the default (entry). Unlike most analyzer configuration options, post processors act on the total result of all analyzed files instead of on just one file/AST entry.
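As a rough sketch of such a transformation: the function `sortBySpecifier` below is hypothetical and the sample data is abbreviated. It only assumes the per-entry shape of `find-imports` results (`file`, `result`, `importSpecifiers`, `source`) and regroups them per specifier across all files.

```js
// Sample shaped like `find-imports` entries (abbreviated)
const entries = [
  { file: './a.js', result: [{ importSpecifiers: ['[default]'], source: 'imported/source-a' }] },
  { file: './b.js', result: [{ importSpecifiers: ['[default]', 'x'], source: 'imported/source-a' }] },
];

// Hypothetical post processor: regroups per-entry results by "specifier::source".
// It needs all entries at once, which is why it runs on the total result.
function sortBySpecifier(allEntries) {
  const grouped = new Map();
  for (const { file, result } of allEntries) {
    for (const { importSpecifiers, source } of result) {
      for (const specifier of importSpecifiers) {
        const key = `${specifier}::${source}`;
        if (!grouped.has(key)) grouped.set(key, []);
        grouped.get(key).push(file);
      }
    }
  }
  return [...grouped].map(([specifier, files]) => ({ specifier, files }));
}

console.log(sortBySpecifier(entries));
```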

Binary file not shown (size before: 7.4 MiB).

Binary file not shown (size before: 6.4 MiB).

Binary file not shown (size before: 918 KiB).

Binary file not shown (size before: 4 MiB).

@ -1,22 +0,0 @@
# Node Tools >> Providence Analytics >> Dashboard ||30
An interactive overview of all aggregated [QueryResults](../../../fundamentals/node-tools/providence-analytics/QueryResult.md) can be found in the dashboard.
The dashboard is a small nodejs server (based on es-dev-server + middleware) and a frontend
application.
## Run
Run `providence dashboard` to start the dashboard and automatically open it in the browser.
## Interface
- Select all reference projects
- Select all target projects
### Generate csv
When `get csv` is pressed, a `.csv` will be downloaded that can be loaded into Excel.
## Analyzer support
Currently, `match-imports` and `match-subclasses` are supported. More analyzers will be added in the future.

@ -2,7 +2,6 @@
 ```js script
 import { html } from '@mdjs/mdjs-preview';
-import { providenceFlowSvg, providenceInternalFlowSvg } from './assets/_mermaid.svg.js';
 ```
 Providence is the 'All Seeing Eye' that generates usage statistics by analyzing code.
@ -10,8 +9,6 @@ It measures the effectivity and popularity of your software.
 With just a few commands you can measure the impact for (breaking) changes, making
 your release process more stable and predictable.
-Providence can be used as a dev dependency in a project for which metrics
-can be generated via analyzers (see below).
 For instance for a repo "lion-based-ui" that extends @lion/\* we can answer questions like:
 - **Which subsets of my product are popular?**
@ -23,170 +20,76 @@ For instance for a repo "lion-based-ui" that extends @lion/\* we can answer ques
 - etc...
-All the above results can be shown in a dashboard (see below), which allows to sort exports from reference project (@lion) based on popularity, category, consumer etc.
-The dashboard allows to aggregate data from many target projects as well and will show you on a detailed (file) level how those components are being consumed by which projects.
-## Setup
-### Install providence
+Providence uses abstract syntax trees (ASTs) to have the most advanced analysis possible.
+It does this via the [oxc parser](https://oxc.rs/docs/guide/usage/parser.html), the quickest parser available today!
+## Run
+Providence expects an analyzer name that tells it what type of analysis to run:
 ```bash
-npm i --save-dev providence-analytics
+npx providence analyze <analyzer-name>
 ```
-### Add a providence script to package.json
-```json
-"scripts": {
-"providence:match-imports": "providence analyze match-imports -r 'node_modules/@lion/ui/*.js'",
-}
-```
-> The example above illustrates how to run the "match-imports" analyzer for reference project 'lion-based-ui'. Note that it is possible to run other analyzers and configurations supported by providence as well. For a full overview of cli options, run `npx providence --help`. All supported analyzers will be viewed when running `npx providence analyze`
-You are now ready to use providence in your project. All
-data will be stored in json files in the folder `./providence-output`
-![CLI](./assets/provicli.gif 'CLI')
-## Setup: Dashboard
-### Add "providence:dashboard" script to package.json
-```js
-...
-"scripts": {
-...
-"providence:dashboard": "providence dashboard"
-}
-```
-### Add providence.conf.js
-```js
-export default {
-referenceCollections: {
-'lion-based-ui-collection': [
-'./node_modules/lion-based-ui/packages/x',
-'./node_modules/lion-based-ui/packages/y',
-],
-},
-};
-```
-Run `npm run providence:dashboard`
-![dashboard](./assets/providash.gif 'dashboard')
-## Setup: about result output
-All output files will be stored in `./providence-output`.
-This means they will be committed to git, so your colleagues don't have to
-rerun the analysis (for large projects with many dependencies this can be time consuming)
-and can directly start the dashboard usage metrics.
-Also, note that the files serve as cache (they are stored with hashes based on project version and analyzer configuration). This means that an interrupted analysis can be
-resumed later on.
-## Conceptual overview
-Providence performs queries on one or more search targets.
-These search targets consist of one or more software projects (javascript/html repositories)
-The diagram below shows how `providenceMain` function can be used from an external context.
-```js story
-export const providenceFlow = () => providenceFlowSvg;
-```
-## Flow inside providence
-The diagram below depicts the flow inside the `providenceMain` function.
-It uses:
-- InputDataService
-Used to create a data structure based on a folder (for instance the search target or
-the references root). The structure creates entries for every file, which get enriched with code,
-ast results, query results etc. Returns `InputData` object.
-- QueryService
-Requires a `queryConfig` and `InputData` object. It will perform a query (grep search or ast analysis)
-and returns a `QueryResult`.
-It also contains helpers for the creation of a `queryConfig`
-- ReportService
-The result gets outputted to the user. Currently, a log to the console and/or a dump to a json file
-are available as output formats.
-```js story
-export const providenceInternalFlow = () => providenceInternalFlowSvg;
-```
-## Queries
-Providence requires a queries as input.
-Queries are defined as objects and can be of two types:
-- feature-query
-- ast-analyzer
-A `QueryConfig` is required as input to run the `providenceMain` function.
-This object specifies the type of query and contains the relevant meta
-information that will later be outputted in the `QueryResult` (the JSON object that
-the `providenceMain` function returns.)
-## Analyzer Query
-Analyzer queries are also created via `QueryConfig`s.
-Analyzers can be described as predefined queries that use AST traversal.
-Run:
-```bash
-providence analyze
-```
-Now you will get a list of all predefined analyzers:
+By default Providence ships these analyzers:
 - find-imports
 - find-exports
 - find-classes
 - match-imports
 - match-subclasses
-- etc...
-![Analyzer query](./assets/analyzer-query.gif 'Analyzer query')
-<!--
-## Running providence from its own repo
-### How to add a new search target project
+Let's say we run `find-imports`:
 ```bash
-git submodule add <git-url> ./providence-input-data/search-targets/<project-name>
+npx providence analyze find-imports
 ```
-### How to add a reference project
-By adding a reference project, you can automatically see how code in your reference project is
-used across the search target projects.
-Under the hood, this automatically creates a set of queries for you.
+Now it retrieves all relevant data about es module imports.
+There are plenty of edge cases that it needs to take into account here;
+you can have a look at the tests to get an idea about all different cases Providence handles for you.
+## Projects
+Providence uses the concept of projects. A project is a piece of software to analyze:
+usually an npm artifact or a git (mono-)repository. What all projects have in common,
+is a package.json. From it, the following project data is derived:
+- the name
+- the version
+- the files it uses for scanning. One of the following strategies is usually followed:
+- exportmap entrypoints (by 'expanding' package.json "exports" on file system)
+- npm files (it reads package.json "files" | .npmignore)
+- the git files (it reads .gitignore)
+- a custom defined list
+For a "find" analyzer, there is one project involved (the target project).
+We can specify it like this (we override the default current working directory):
 ```bash
-git submodule add <git-url> ./providence-input-data/references/<project-name>
+npx providence analyze find-imports -t /importing/project
 ```
-### Updating submodules
-Please run:
+For a "match" analyzer, there is also a reference project.
+Here we match the exports of the reference project (-r) against the imports of the target project (-t).
 ```bash
-git submodule update --init --recursive
+npx providence analyze match-imports -t /importing/project -r /exporting/project
 ```
-### Removing submodules
-Please run:
+## Utils
+Providence comes with many tools for deep traversal of identifiers,
+the (babel like) traversal of ast trees in oxc and swc and more.
+Also more generic utils for caching and performant globbing come delivered with Providence.
+For a better understanding, check out the utils folders (tests and code).
+## More
+For more options, see:
 ```bash
-sh ./rm-submodule.sh <path/to/submodule>
+npx providence --help
 ```
--->