# Clean Data on BrightHub

### Introduction

BrightHub contains a powerful cleaning tool for inspecting and cleaning any measurement's raw timeseries data. The tool supports both manual and automatic cleaning, through the use of cleaning logs and user-defined cleaning rules.

The key features of the BrightHub cleaning tab include:

* User-defined interactive cleaning plots, both timeseries and scatter.
* Plotting reference measurements such as reanalysis data or a nearby station.
* A cleaning log interface, allowing download/upload of a station cleaning log.
* Cleaning rules, allowing for automated cleaning of data based on user-defined rules.

For a more detailed description of the features in the cleaning tab, see the cleaning tab section of the [Reference User Manual](/brighthub-user-docs/reference-user-manual.md#cleaning-tab).

Watch the video below for a visual tutorial on cleaning data using the cleaning plots and cleaning log, or read on to learn how to add cleaning using the cleaning plots interface, add cleaning to the cleaning log and add automatic cleaning rules.

***

### Video Tutorial

{% embed url="<https://youtu.be/I_EuNSeHz5U?si=DbQvFCC9oi0P2sdB>" %}

***

### Written Tutorial

**Before you start**

You must have the "Write" permission on a BrightHub station in order to edit or apply cleaning. A user with "Read All" access is able to view and download all station cleaning. To gain "Write" access, contact your organisation's Admin.

You should have set up the station as per the [Station Setup Guide](/brighthub-user-docs/station-setup-guide.md). It is not necessary to have automatic data retrieval set up in order to clean data; however, if it is set up, the cleaning plots will update with any newly retrieved data.

**Cleaning plots**

BrightHub creates a group of default cleaning plots based on the station measurement points. It is then up to the user whether new plots are added or existing plots are edited.

There are two main options for manually cleaning data: selecting the data in the visual cleaning tool interface, or entering the cleaning directly in the cleaning log.

**Manual cleaning of data using existing plots**

To add cleaning to existing plots:

1. Select data from a plot using the box select button located within each cleaning plot and drag the box to select the data points you wish to clean.

   <figure><img src="/files/fICaNrpyPNA4r4BbLc2E" alt=""><figcaption></figcaption></figure>
2. Click the "Add Cleaning" button. Here you can select a reason for the cleaning, link the cleaning to an issue in the station [issue log](/brighthub-user-docs/reference-user-manual/issue-log.md) and select the measurements you wish to clean out.
3. Once the cleaning is added, the new entries will appear in the station "Cleaning Log" below.

**Cleaning by manually modifying the cleaning log**

The cleaning log contains a record of all cleaning done to the station. To add any cleaning using this method:

1. Click the "Add Cleaning" button to add an entry to the cleaning log.
2. Enter the relevant information to specify the details of the cleaning entry. Here you must specify the measurement name for the cleaning to be applied to, the time range to clean out and the reason from the drop-down list. As with cleaning in the cleaning tool, you can specify an issue to refer to.
3. Click save to apply the changes. The cleaning will be applied immediately and can then be viewed within the cleaning tool.

{% hint style="info" %}
Reminder: "Date From" in BrightHub is inclusive, while "Date To" is exclusive of the entered timestamp.
{% endhint %}
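The inclusive/exclusive behaviour of "Date From" and "Date To" can be illustrated with a short Python sketch. This is not BrightHub code: the `CleaningEntry` class, its field names, and the "Icing" reason are hypothetical stand-ins for a cleaning log row, used only to show how a half-open time range treats its boundary timestamps.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CleaningEntry:
    """Hypothetical model of one cleaning log row (field names are illustrative)."""
    measurement: str
    date_from: datetime  # inclusive
    date_to: datetime    # exclusive
    reason: str

    def covers(self, timestamp: datetime) -> bool:
        # Half-open interval: date_from <= timestamp < date_to
        return self.date_from <= timestamp < self.date_to


entry = CleaningEntry(
    measurement="Spd_100m",
    date_from=datetime(2024, 1, 1, 0, 0),
    date_to=datetime(2024, 1, 1, 0, 30),
    reason="Icing",  # illustrative reason, not taken from BrightHub's list
)

# Four 10-minute timestamps: 00:00, 00:10, 00:20, 00:30
timestamps = [datetime(2024, 1, 1, 0, m) for m in (0, 10, 20, 30)]
cleaned = [t for t in timestamps if entry.covers(t)]

for t in cleaned:
    print(t.isoformat())
```

In this sketch the 00:00 record is cleaned because "Date From" is inclusive, while the 00:30 record is kept because "Date To" is exclusive. To clean the 00:30 record as well, the entry would need a later "Date To".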

**Automated cleaning using BrightHub's Cleaning Rules**

If there are quantifiable conditions where data should be cleaned out, you can add a cleaning rule in BrightHub to automate the cleaning of the timeseries where the condition holds true.

See the examples below for cleaning ZX LiDAR flags and implementing WindCube availability filters:

<details>

<summary>Cleaning Rule example - ZX LiDAR flags</summary>

Some logger types write flag values into the data to indicate records that should be filtered and removed from the dataset. For example, a ZX LiDAR uses 9999, 9998 and 9992 as flags. To remove these flags using BrightHub cleaning rules, you can add a rule like the following:

<figure><img src="/files/AF5bcMkgGTxmUVPIrtSo" alt=""><figcaption><p>ZX LiDAR flag cleaning rule example</p></figcaption></figure>

This rule will clean out all Spd\_100m measurements if the Spd\_100m measurement records a value greater than or equal to 9998 (effectively removing the 9998 or 9999 flags). A 9992 flag falls below this threshold, so it would need its own rule. Similar rules can be made for all other measurements by copying the cleaning rule and modifying the column name entry.
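The effect of a threshold rule like this can be sketched in Python. This is a minimal illustration of the logic only, not how BrightHub implements cleaning rules; the `clean_flags` function and the sample values are assumptions made for the example.

```python
import math

FLAG_THRESHOLD = 9998  # values at or above this are treated as flags (9998, 9999)


def clean_flags(values, threshold=FLAG_THRESHOLD):
    """Return a copy of the series with flagged values replaced by NaN."""
    return [math.nan if v >= threshold else v for v in values]


# Made-up Spd_100m readings containing two flag values
spd_100m = [7.2, 9999, 8.1, 9998, 6.4]
cleaned = clean_flags(spd_100m)
print(cleaned)
```

Valid wind speeds pass through unchanged, and only the readings that meet the "greater than or equal to" condition are cleaned out.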

</details>

<details>

<summary>Cleaning Rule example - WindCube availability filter</summary>

<figure><img src="/files/tddhFmFhVZnTpwRpRUY6" alt=""><figcaption><p>Cleaning Rule example for recommended WindCube availability filter</p></figcaption></figure>

This cleaning rule cleans out WindCube data when the value in the data availability column is less than 80. This means that when the availability for the averaging period is less than 80%, data at that timestamp will not be used.
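The same idea, where a condition on one column cleans out values in other columns, can be sketched in Python. This is an illustration under assumptions rather than BrightHub's implementation: the `apply_availability_filter` function and the column names `Avail_100m` and `Spd_100m` are hypothetical.

```python
import math


def apply_availability_filter(rows, availability_col, target_cols, min_availability=80):
    """Return new rows with target columns set to NaN where availability is too low."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        if row[availability_col] < min_availability:
            for col in target_cols:
                new_row[col] = math.nan
        cleaned.append(new_row)
    return cleaned


# Made-up records with hypothetical column names
rows = [
    {"timestamp": "2024-01-01 00:00", "Avail_100m": 95, "Spd_100m": 7.2},
    {"timestamp": "2024-01-01 00:10", "Avail_100m": 60, "Spd_100m": 8.1},
    {"timestamp": "2024-01-01 00:20", "Avail_100m": 80, "Spd_100m": 6.4},
]
cleaned = apply_availability_filter(rows, "Avail_100m", ["Spd_100m"])

for row in cleaned:
    print(row)
```

Because the condition is strictly "less than 80", the record with an availability of exactly 80 is kept, and only the 60% record is cleaned out.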

</details>

Cleaning rules can be uploaded to and downloaded from BrightHub as a .json file. See [Cleaning Tab](/brighthub-user-docs/reference-user-manual/cleaning-tab.md) for more information on the file format.

