# Use of random barcode in data analysis

**I am interested in the evaluation of random barcodes, which I don't completely understand. The barcode marks individual cDNAs, but how can it solve the problem of PCR artefacts? For example, suppose there are two barcodes at a particular position, one with 100 reads and the other with 200 reads. If PCR efficiency were the same for all cDNAs, both barcodes should have the same number of reads, so you should choose the minimal number of reads, i.e., 100 reads. This is my guess; I don't know if it's correct.**

A good example of random barcode analysis is Fig 1C in “iCLIP Predicts the Dual Splicing Effects of TIA-RNA Interactions” by Wang et al., PLoS Biology, 2010. It shows the random barcode for each sequence, with the number of sequences that had the same barcode in brackets. The read counts themselves are not used. Because the barcode marks each individual cDNA, all PCR copies of that cDNA carry the same barcode, so if multiple sequences mapping to the same position in the genome have the same random barcode, they are all counted as 1. In your example, only two different random barcodes map to the same position, so the cDNA count is 2, regardless of whether a barcode has 100 or 200 reads. Counting unique barcodes rather than reads is what corrects for PCR artefacts.
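
The counting rule can be illustrated with a minimal Python sketch. This is not the pipeline used in the paper; the function name `count_cdnas`, the tuple layout of the reads, and the barcode sequences are assumptions made purely for illustration. It collapses reads that share both a mapping position and a barcode, then reports the number of distinct barcodes per position.

```python
from collections import defaultdict

def count_cdnas(reads):
    """Count unique barcodes per mapped position.

    `reads` is an iterable of (chrom, strand, position, barcode) tuples,
    one per mapped read (a hypothetical layout chosen for this sketch).
    """
    barcodes_at = defaultdict(set)
    for chrom, strand, position, barcode in reads:
        # Reads with the same position and barcode collapse into one set entry.
        barcodes_at[(chrom, strand, position)].add(barcode)
    return {site: len(barcodes) for site, barcodes in barcodes_at.items()}

# The example from the question: two barcodes at one position,
# one with 100 reads and one with 200 reads (barcode sequences are made up).
reads = [("chr1", "+", 1000, "ACG")] * 100 + [("chr1", "+", 1000, "TTA")] * 200
cdna_counts = count_cdnas(reads)
print(cdna_counts)  # {('chr1', '+', 1000): 2}
```

The 300 reads reduce to a cDNA count of 2, because only the number of distinct barcodes at the position matters, not how many reads each barcode has.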

