## Materials and Methods  

### Sample Information and Experimental Design  
Six RNA‑seq libraries derived from mouse embryonic stem cells were processed in this study. The libraries were classified according to differentiation status: three undifferentiated samples (S1_undiff, S2_undiff, S3_undiff) and three differentiated samples (S4_diff, S5_diff, S6_diff). All libraries were sequenced as single‑end reads; consequently, the `paired` option was set to `false` throughout the quality‑control workflow. The samples were listed in the project‑wide input manifest (`input_dataset.tsv`) located at  

```
/srv/GT/analysis/course_sushi/public/gstore/projects/p1001/Fastqc_2025-11-25--15-07-30/input_dataset.tsv
```  
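
As an illustration of how such a manifest can be read and grouped by differentiation status, the following is a minimal sketch. The column names (`Name`, `Read1`, `Condition`) and the FASTQ file names are hypothetical; the actual header of `input_dataset.tsv` is not reproduced in this section and may differ.

```python
import csv
import io

# Hypothetical manifest content: the column names and file names are
# assumptions made for illustration, not the real input_dataset.tsv.
EXAMPLE_MANIFEST = "\n".join(
    ["Name\tRead1\tCondition"]
    + [f"S{i}_undiff\tS{i}_undiff.fastq.gz\tundiff" for i in (1, 2, 3)]
    + [f"S{i}_diff\tS{i}_diff.fastq.gz\tdiff" for i in (4, 5, 6)]
)


def read_manifest(handle):
    """Return one dict per sample row of a tab-separated manifest."""
    return list(csv.DictReader(handle, delimiter="\t"))


def group_by_condition(rows, column="Condition"):
    """Map each condition label to the list of sample names carrying it."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row["Name"])
    return groups


rows = read_manifest(io.StringIO(EXAMPLE_MANIFEST))
groups = group_by_condition(rows)
for condition, samples in groups.items():
    print(condition, samples)
```

To read a real manifest, `io.StringIO(EXAMPLE_MANIFEST)` would be replaced by an open file handle and the column names adjusted to the actual header.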

### Computational Environment  
Analyses were executed on a compute node of the *course* partition of the NGSEQ cluster. The job requested 8 CPU cores, 15 GB of RAM and 100 GB of local scratch space. The software environment was loaded via the Lmod module system:

| Tool | Version |
|------|---------|
| FastQC | 0.12.1 |
| fastp | 0.23.4 |
| Picard | 3.2.0 |
| samtools | 1.20 |
| R | 4.5.0 |
| ezRun (R package) | latest from NGSEQ repository |

All analyses were performed under a `umask 0002` setting, which masks only the write bit for other users, so that output files were created readable and writable by the project group.
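
The effect of this umask can be checked with simple bit arithmetic. The sketch below assumes the usual POSIX convention that programs request mode `0666` for new files and `0777` for new directories; the helper name `effective_mode` is introduced here for illustration only.

```python
import stat


def effective_mode(requested, mask):
    """Permission bits that remain after the umask clears its bits."""
    return requested & ~mask


UMASK = 0o002

file_mode = effective_mode(0o666, UMASK)  # typical request for regular files
dir_mode = effective_mode(0o777, UMASK)   # typical request for directories

# stat.filemode renders the bits in the familiar `ls -l` notation
print(stat.filemode(stat.S_IFREG | file_mode))
print(stat.filemode(stat.S_IFDIR | dir_mode))
```

Under these assumptions, files keep read and write permission for owner and group, and only the write permission for other users is removed.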

### Data Processing Pipeline  

#### FastQC Quality Assessment  
The FastQC application (`QC/FastQC/0.12.1`) was invoked through the NGSEQ wrapper **EzAppFastqc**. The R script executed the following steps:

1. **Input parsing** – the manifest file supplied the absolute paths to the raw FASTQ files for the six samples.  
2. **FastQC execution** – each FASTQ file was processed individually with default FastQC parameters; the `showNativeReports` flag was disabled (`false`) to suppress interactive HTML generation for each sample.  
3. **Result aggregation** – FastQC output directories for all samples were copied to  

```
p1001/Fastqc_2025-11-25--15-07-30/FastQC
```  
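
The per‑sample execution in step 2 can be sketched as follows. This is not the code of the EzAppFastqc wrapper, whose internals are not shown here; it only assembles one plain `fastqc` command line per FASTQ file, using the standard `--outdir` option and otherwise default parameters. The FASTQ file names are hypothetical.

```python
import shlex

# Hypothetical FASTQ file names; the real paths come from the manifest.
FASTQ_FILES = [
    "S1_undiff.fastq.gz", "S2_undiff.fastq.gz", "S3_undiff.fastq.gz",
    "S4_diff.fastq.gz", "S5_diff.fastq.gz", "S6_diff.fastq.gz",
]


def fastqc_command(fastq_path, out_dir):
    """Build the argument list for one FastQC run with default parameters."""
    return ["fastqc", "--outdir", out_dir, fastq_path]


commands = [fastqc_command(path, "FastQC") for path in FASTQ_FILES]

for cmd in commands:
    # Printed rather than executed, so the sketch runs without FastQC installed.
    # To execute, pass each list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Processing each file in its own invocation mirrors the description above, where every library is assessed individually before the results are collected in one directory.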

#### MultiQC Summary  
After per‑sample FastQC runs, the **MultiQC** tool (version bundled with the FastQC module) was used to collate all FastQC reports into a single interactive HTML document. The aggregated report was stored at  

```
p1001/Fastqc_2025-11-25--15-07-30/multi_FastQC/multiqc_report.html
```  

The MultiQC directory was synchronized to the permanent project storage area using `rsync`.
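
A minimal sketch of the aggregation and synchronization commands is given below. It assumes that MultiQC is called directly on the `FastQC` result directory and that `rsync` is run in archive mode (`-a`); the exact options used by the wrapper are not documented in this section, and the local directory layout is an assumption.

```python
import shlex

# Destination taken from the paths quoted in this section.
PROJECT_DIR = (
    "/srv/GT/analysis/course_sushi/public/gstore/projects/"
    "p1001/Fastqc_2025-11-25--15-07-30"
)


def multiqc_command(fastqc_dir, out_dir):
    """Collate all FastQC reports found under fastqc_dir into out_dir."""
    return ["multiqc", fastqc_dir, "--outdir", out_dir]


def rsync_command(source_dir, dest_parent):
    """Copy source_dir itself (no trailing slash) into dest_parent."""
    return ["rsync", "-a", source_dir.rstrip("/"), dest_parent.rstrip("/") + "/"]


multiqc_cmd = multiqc_command("FastQC", "multi_FastQC")
rsync_cmd = rsync_command("multi_FastQC", PROJECT_DIR)

# Printed rather than executed, so the sketch runs without the tools installed.
print(shlex.join(multiqc_cmd))
print(shlex.join(rsync_cmd))
```

Omitting the trailing slash on the `rsync` source makes the `multi_FastQC` directory itself appear inside the destination, which matches the report path given above.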

### Quality Control Measures  

- **Per‑sample diagnostics** – FastQC provided standard metrics (per‑base quality scores, GC content, sequence duplication levels, over‑represented sequences, and adapter content).  
- **Global overview** – MultiQC summarized these metrics across the six libraries, enabling rapid identification of outlier samples or systematic biases.  
- **Reproducibility** – All command‑line invocations, module versions, and R parameters were captured in the job script and the R environment (`EZ_GLOBAL_VARIABLES.txt`). The temporary scratch directory (`/scratch/Fastqc_2025-11-25--15-07-30_temp$$`, where `$$` expands to the process ID of the job shell) was removed after successful completion to prevent data leakage.
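
The scratch‑directory handling can be sketched as follows. This is an illustrative pattern rather than the actual job script: the function name `run_in_scratch` is hypothetical, `os.getpid()` stands in for the shell's `$$`, and the system temporary directory replaces `/scratch` so that the sketch runs anywhere.

```python
import os
import shutil
import tempfile


def run_in_scratch(job_name, work, scratch_root=None):
    """Run work(scratch_dir) and remove the directory only if it succeeds."""
    root = scratch_root or tempfile.gettempdir()
    # A per-process suffix keeps concurrent jobs from sharing a directory.
    scratch = os.path.join(root, f"{job_name}_temp{os.getpid()}")
    os.makedirs(scratch)
    result = work(scratch)   # an exception here leaves scratch in place
    shutil.rmtree(scratch)   # reached only after successful completion
    return result, scratch


def write_marker(scratch_dir):
    """Stand-in for the real analysis: write one file into scratch."""
    with open(os.path.join(scratch_dir, "marker.txt"), "w") as handle:
        handle.write("ok\n")
    return "done"


result, scratch_path = run_in_scratch("Fastqc_demo", write_marker)
print(result, os.path.exists(scratch_path))
```

Removing the directory only on success keeps intermediate files available for inspection when a job fails, at the cost of requiring manual cleanup in that case.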

### Data Availability  

The final FastQC result folder (`FastQC`) and the MultiQC report (`multiqc_report.html`) are available in the project repository under  

```
/srv/GT/analysis/course_sushi/public/gstore/projects/p1001/Fastqc_2025-11-25--15-07-30/
```  

These files constitute the complete quality‑control dataset for the six RNA‑seq libraries and can be accessed for downstream analyses.  