OSS
Title: OSS Training Course
Author: Lukasz Sokolowski
OSS Training Materials
Copyright Notice
Copyright © 2004-2026 by NobleProg Limited. All rights reserved.
This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise.
Introduction/Outline
- Code management, versioning, and licensing
- Automation and code quality (best practices)
- Continuous Integration (CI) on GitHub/GitLab
- Automated testing (unit, integration, end-to-end)
- Changelog (Keep a Changelog, Conventional Commits)
- Issue management and roadmap
- Best practices in issue creation (templates, labels, milestones)
- Documentation
- Effective README: objectives, installation, usage, contributions
- Contributing Guide (CONTRIBUTING.md)
- API documentation (Swagger, Sphinx, Docusaurus, etc.)
Main Keys/Concerns
From Andre's notes and suggestions
Open Source Software in Research: Strategy, Practice, and Impact
Open Source Software (OSS) as a strategic, technical, and scientific asset for research institutes, illustrated with real-world examples.
Slide 1 — The Strategic Role of OSS in a Research Institute
Open Source Software is foundational to modern research practice.
Benefits:
- Visibility of research outputs
- Increased scientific impact
- Collaboration across institutions
- Transparency and reproducibility
- Long-term sustainability
OSS functions as shared research infrastructure.
Slide 2 — Real-World Example: CERN
CERN treats software as a first-class research output.
Practices:
- Public repositories for core software (e.g., ROOT, Geant4)
- Strong open source licensing culture
- Long-term maintenance beyond individual projects
Impact:
- Software reused globally in physics and beyond
- Industrial and academic adoption
- Software cited alongside publications
Key lesson: Large-scale research infrastructures depend on open software.
Slide 3 — OSS and Publicly Funded Research
Open software aligns with public funding principles.
Key drivers:
- Open science mandates
- Reproducibility requirements
- Accountability to taxpayers
OSS ensures research results are:
- Verifiable
- Reusable
- Preserved beyond project lifetimes
Slide 4 — Real-World Example: European Commission & EOSC
The European Open Science Cloud (EOSC) promotes OSS.
Practices:
- Preference for open licenses
- FAIR principles applied to software
- Software recognized as a research output
Impact:
- Policy-level support for open software
- Alignment across national research infrastructures
Key lesson: OSS is increasingly embedded in research policy.
Slide 5 — Licensing Choices: Why They Matter
Licenses define legal reuse.
Without a license:
- Default copyright applies, so others have no legal right to reuse, modify, or redistribute the code
- Collaboration is legally blocked
Licensing must be intentional and documented.
Slide 6 — Real-World Example: NumPy & SciPy
NumPy and SciPy originated in academic research.
License choice:
- BSD (permissive)
Outcomes:
- Massive industrial and academic adoption
- Integration into commercial products
- Long-term sustainability via a broad community
Key lesson: Permissive licenses can maximize scientific reach.
Slide 7 — Permissive vs. Copyleft Licenses
Two main license families are common in research.
Permissive:
- MIT, BSD, Apache 2.0
- Fewer restrictions
- High reuse potential
Copyleft:
- GPL (strong copyleft), LGPL (weak copyleft)
- Ensures openness of derivatives
- May limit industrial integration
Slide 8 — Real-World Example: GNU Scientific Software
GNU scientific tools use copyleft licenses.
License choice:
- GPL
Outcomes:
- Guaranteed openness of derivatives
- Strong alignment with free software principles
- Smaller but ideologically aligned ecosystem
Key lesson: Copyleft prioritizes openness over adoption scale.
Slide 9 — Minimum Best Practices for Publishing Research Software
Research software should meet baseline standards.
Required:
- Public repository
- Clear license
- Documentation
- Versioning
- Citation metadata
Quality enables reuse.
Slide 10 — Real-World Example: EMBL-EBI
EMBL-EBI publishes bioinformatics software openly.
Practices:
- Standardized repositories
- Clear documentation
- Explicit versioning and releases
Impact:
- Tools reused globally in life sciences
- Software cited in publications
- Long-lived community tools
Key lesson: Consistency scales reuse.
Slide 11 — Documentation as a Research Output
Documentation supports reproducibility.
Minimum documentation:
- Purpose and scope
- Installation
- Usage examples
- Limitations
Good documentation is an investment, not overhead.
Slide 12 — Versioning, Releases, and Citation
Stable versions enable scientific referencing.
Best practices:
- Semantic Versioning
- Git tags
- DOI assignment via Zenodo
- CITATION.cff file
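The CITATION.cff file mentioned above is a small YAML document. The following is a minimal sketch of generating one from Python, assuming the Citation File Format 1.2.0 keys `cff-version`, `message`, `title`, `version`, `date-released`, `doi`, and `authors`. The project name, author, and date are placeholders, and the helper `render_citation_cff` is hypothetical. It does not escape quotes inside values.

```python
from datetime import date


def render_citation_cff(title, version, authors, released, doi=None):
    """Return the text of a minimal CITATION.cff file (CFF 1.2.0 keys)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f"date-released: {released.isoformat()}",
    ]
    if doi:
        # Only written once a DOI exists, e.g. after the first archived release.
        lines.append(f'doi: "{doi}"')
    lines.append("authors:")
    for family, given in authors:
        lines.append(f'  - family-names: "{family}"')
        lines.append(f'    given-names: "{given}"')
    return "\n".join(lines) + "\n"


example = render_citation_cff(
    title="example-tool",        # hypothetical project name
    version="1.2.0",
    authors=[("Doe", "Jane")],   # placeholder author
    released=date(2025, 1, 15),  # placeholder release date
)
print(example)
```

Keeping the version and release date in this file in step with the Git tag is what makes a version-specific citation possible.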
Slide 13 — Real-World Example: Zenodo + GitHub
Many institutes integrate GitHub with Zenodo.
Practices:
- DOI minted for each release
- Software cited like a paper
- Version-specific references
Used by:
- CERN
- Universities
- EU-funded projects
Key lesson: Infrastructure exists — use it.
Slide 14 — Governance When Opening Internal Code
Open code requires explicit governance.
Key questions:
- Who reviews changes?
- Who releases software?
- Who resolves disputes?
Governance should be lightweight but explicit.
Slide 15 — Real-World Example: Apache Software Foundation
ASF provides a mature governance model.
Practices:
- Merit-based contributor model
- Clear maintainer roles
- Transparent decision-making
Impact:
- Sustainable projects
- Low institutional dependency
- Long-term continuity
Key lesson: Governance enables longevity.
Slide 16 — Managing External Contributions
External contributions need structure.
Best practices:
- Pull Requests only
- Mandatory reviews
- CI enforcement
- Code of Conduct
These practices protect both contributors and institutions.
Slide 17 — Positioning OSS as Scientific Impact
Software impact is measurable.
Indicators:
- Citations (DOIs)
- External contributors
- Downstream reuse
- Inclusion in workflows or infrastructures
Slide 18 — Real-World Example: Research Software as Impact
Examples:
- R language ecosystem (originated in academia)
- scikit-learn (academic origins, global adoption)
- Astropy (community-governed astronomy software)
Recognized impact:
- Thousands of citations
- Used in publications across disciplines
Key lesson: OSS can outlive individual projects.
Slide 19 — Technical Repository Management
Engineering practices support trust.
Minimum requirements:
- Stable main branch
- PR-based workflow
- Automated tests
- CI pipelines
- Release tagging
- Dependency management
Slide 20 — Real-World Example: NASA Open Source
NASA publishes and maintains OSS.
Practices:
- Mandatory open repositories
- Automated CI
- Clear contribution rules
Impact:
- External reuse
- Industry collaboration
- Increased transparency
Key lesson: Technical discipline enables openness at scale.
Slide 21 — Key Takeaways
- OSS is strategic research infrastructure
- Licensing shapes reuse and impact
- Minimum quality standards are essential
- Governance enables safe collaboration
- Software impact is measurable and reportable
- Automation sustains quality over time
Extended Details
Modern Software Development Practices (Python & JavaScript)
Best practices for managing, testing, and documenting software projects built with Python and JavaScript.
Code Management, Versioning, and Licensing
- Use Git for source control
- Branching strategy:
  - main – stable production code
  - feature branches – new development
- Use Semantic Versioning (MAJOR.MINOR.PATCH)
- Add a LICENSE file (MIT or Apache 2.0 commonly used)
- Protect main branches with:
  - Pull / Merge Request reviews
  - Mandatory CI checks
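To make the MAJOR.MINOR.PATCH rule concrete, here is a minimal sketch of a version bump in Python. The function name `bump` is an illustrative assumption, and the sketch deliberately ignores pre-release and build-metadata suffixes.

```python
import re

# Plain MAJOR.MINOR.PATCH only; suffixes such as "-rc.1" are out of scope here.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump(version: str, part: str) -> str:
    """Return the next version after bumping 'major', 'minor', or 'patch'."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"          # breaking change: reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"    # new feature: reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fix
    raise ValueError(f"unknown part: {part!r}")


print(bump("1.4.2", "minor"))
```

Lower-order components reset to zero whenever a higher-order component is incremented.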
Automation and Code Quality (Python & JS)
Python
- Linters: flake8, pylint
- Formatter: black
- Import sorting: isort
- Type checking: mypy
JavaScript
- Linter: ESLint
- Formatter: Prettier
- Type checking: TypeScript (recommended)
Best practices:
- Run linters and formatters automatically
- Keep functions small and readable
- Follow PEP 8 (Python) and standard JS style guides
Continuous Integration (CI)
CI pipelines automatically validate code on each push or pull request.
Example: GitHub Actions
name: CI

on:
  pull_request:
  push:
    branches: [ main ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      # Assumes requirements.txt also lists flake8, black, and pytest
      - name: Install Python dependencies
        run: |
          pip install -r requirements.txt

      - name: Lint Python
        run: |
          flake8 .
          black --check .

      - name: Run Python tests
        run: pytest

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install JS dependencies
        run: npm ci

      - name: Lint JS
        run: npm run lint

      - name: Run JS tests
        run: npm test
Example: GitLab CI
stages:
  - lint
  - test

python_lint:
  stage: lint
  image: python:3.11
  script:
    - pip install flake8 black
    - flake8 .
    - black --check .

python_test:
  stage: test
  image: python:3.11
  script:
    - pip install -r requirements.txt
    - pytest

js_lint:
  stage: lint
  image: node:20
  script:
    - npm ci
    - npm run lint

js_test:
  stage: test
  image: node:20
  script:
    - npm ci
    - npm test
Benefits:
- Early detection of issues
- Enforced quality standards
- Reliable and repeatable builds
Automated Testing
Python
- Frameworks: pytest, unittest
- Tools:
  - pytest-cov (coverage)
  - requests-mock / responses (API mocking)
JavaScript
- Unit & integration: Jest, Vitest
- End-to-end (E2E): Cypress, Playwright
Best practices:
- Run tests automatically in CI
- Test behavior, not implementation details
- Keep test execution fast
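As an illustration of testing behavior rather than implementation details, the sketch below checks only the inputs and outputs of a small function, so the tests keep passing if its internals are rewritten. The function `slugify` and the test names are hypothetical. The tests use plain `assert` statements in the style pytest collects.

```python
import re


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Each test states one observable behavior; none inspects how slugify works.
def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation_and_extra_spaces():
    assert slugify("  CI/CD: a primer! ") == "ci-cd-a-primer"


def test_slugify_of_empty_title_is_empty():
    assert slugify("") == ""
```

Saved as a file such as `test_slugify.py` (an assumed name), these would run with a plain `pytest` invocation and finish in milliseconds.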
Changelog and Commit Standards
- Maintain CHANGELOG.md
- Follow Keep a Changelog structure:
  - Added
  - Changed
  - Deprecated
  - Removed
  - Fixed
  - Security
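The sketch below shows how one release entry could be rendered using the Keep a Changelog section names (Added, Changed, Deprecated, Removed, Fixed, Security). The helper `render_release` and the sample entries are illustrative assumptions, not part of any tool.

```python
# Section names and their order as used by Keep a Changelog.
SECTION_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def render_release(version, released, changes):
    """Render one release as Markdown; `changes` maps section name -> entries."""
    lines = [f"## [{version}] - {released}"]
    for section in SECTION_ORDER:
        entries = changes.get(section, [])
        if not entries:
            continue  # empty sections are omitted
        lines.append("")
        lines.append(f"### {section}")
        lines.extend(f"- {entry}" for entry in entries)
    return "\n".join(lines) + "\n"


print(render_release(
    "1.1.0",
    "2025-01-15",  # placeholder date
    {"Fixed": ["Handle empty input"], "Added": ["CSV export"]},
))
```

Sections appear in a fixed order regardless of how the input dictionary is ordered, which keeps successive releases easy to compare.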
Conventional Commits
- feat: new feature
- fix: bug fix
- docs: documentation
- test: tests
- chore: maintenance
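A commit header in this style can be checked mechanically, for example in a pre-commit hook or CI step. The following is a minimal sketch that accepts only the five types listed above, with an optional scope and `!` marker. The names `ALLOWED_TYPES`, `HEADER`, and `is_conventional` are assumptions for illustration, and the full specification allows more than this pattern covers.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "docs", "test", "chore")

# type, optional "(scope)", optional "!", then ": " and a non-empty subject
HEADER = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<subject>\S.*)$"
)


def is_conventional(header: str) -> bool:
    """Return True if the first line of a commit message matches the pattern."""
    return HEADER.match(header) is not None


for line in ["feat(api): add search endpoint", "Fixed stuff"]:
    print(f"{line!r}: {is_conventional(line)}")
```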
Issue Management and Roadmap
- Use issues to track bugs, features, and technical debt
- Organize work using milestones and boards
- Reference issues in commits and merge requests
Best Practices in Issue Creation
- Use issue templates (bug / feature)
- Apply labels:
  - python
  - javascript
  - bug
  - enhancement
  - documentation
- Always include clear reproduction steps for bugs
Documentation
Effective README
A strong README.md includes:
- Project overview
- Python / Node.js requirements
- Installation steps
- Usage examples
- Testing instructions
- License
Contributing Guide (CONTRIBUTING.md)
Should define:
- Environment setup
- Coding standards
- Commit conventions
- Pull Request workflow
API Documentation
Python
- Sphinx – documentation from docstrings
- FastAPI – automatic OpenAPI / Swagger
- MkDocs – lightweight docs
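For documentation generated from docstrings, the docstring itself is the source text. The sketch below shows one common convention, reStructuredText field lists, which Sphinx can render when its autodoc extension is enabled. The function `mean` is a hypothetical example.

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    :param values: Non-empty sequence of numbers.
    :type values: list[float]
    :returns: Sum of the values divided by their count.
    :rtype: float
    :raises ValueError: If ``values`` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


print(mean([1, 2, 3]))
```

Because the description lives next to the code, a reviewer can check both in the same pull request.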
JavaScript
- Swagger / OpenAPI – REST APIs
- JSDoc – inline documentation
- Docusaurus – documentation portals
Recommended Project Structure
Example structure for a combined Python + JavaScript repository:
project-root/
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── main.py
│ │ ├── api/
│ │ └── services/
│ ├── tests/
│ ├── requirements.txt
│ └── pyproject.toml
│
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ └── services/
│ ├── tests/
│ ├── package.json
│ └── package-lock.json
│
├── docs/
│ ├── api/
│ └── guides/
│
├── .github/
│ └── workflows/ (GitLab CI instead reads .gitlab-ci.yml from project-root/)
│
├── CHANGELOG.md
├── CONTRIBUTING.md
├── README.md
└── LICENSE
Key Takeaways
- CI enforces quality for Python and JavaScript
- Automated testing reduces regressions
- Clear structure improves maintainability
- Documentation is part of the codebase
Open Source Best Practices (Python & JavaScript)
This section extends the project guidelines with patterns commonly used in successful open source projects.
Separate CI Pipelines per Service
In multi-service or monorepo projects, each service should have an independent CI pipeline.
Benefits:
- Faster CI execution
- Clear ownership per service
- Reduced coupling between frontend and backend
GitHub Actions (Per Service)
Each service has its own workflow file.
.github/workflows/
├── backend-ci.yml
└── frontend-ci.yml
Example: Backend CI
name: Backend CI

on:
  push:
    paths:
      - "backend/**"
  pull_request:
    paths:
      - "backend/**"

jobs:
  backend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Assumes backend/requirements.txt also lists flake8 and pytest
      - run: pip install -r backend/requirements.txt
      - run: flake8 backend
      - run: pytest backend/tests
Example: Frontend CI
name: Frontend CI

on:
  push:
    paths:
      - "frontend/**"
  pull_request:
    paths:
      - "frontend/**"

jobs:
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: cd frontend && npm ci
      - run: cd frontend && npm run lint
      - run: cd frontend && npm test
GitLab CI (Per Service)
backend:
  stage: test
  rules:
    - changes:
        - backend/**/*
  image: python:3.11
  script:
    - pip install -r backend/requirements.txt
    - pytest backend/tests
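The path filters in the per-service pipelines above decide which service a change belongs to. The sketch below reproduces that decision in plain Python to show the idea. It is not how GitHub or GitLab implement their filters, and the names `SERVICE_DIRS` and `affected_services` are assumptions.

```python
from pathlib import PurePosixPath

# Top-level directories that each have their own pipeline (assumed layout).
SERVICE_DIRS = {"backend", "frontend"}


def affected_services(changed_files):
    """Return the set of services touched by a list of repo-relative paths."""
    services = set()
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Only files inside a service directory count, not top-level files.
        if len(parts) > 1 and parts[0] in SERVICE_DIRS:
            services.add(parts[0])
    return services


print(affected_services(["backend/app/main.py", "README.md"]))
```

A change that touches only `README.md` maps to no service, so neither service pipeline would need to run for it.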
Monorepo vs Multirepo
Choosing the right repository strategy is critical for scalability.
| Aspect | Monorepo | Multirepo |
|---|---|---|
| Code location | Single repository | One repository per service |
| CI complexity | Higher | Lower |
| Dependency sharing | Easy | Requires versioning |
| Access control | Unified | Granular |
| Tooling | Requires advanced CI | Simpler |
| Open source friendliness | Good for small teams | Best for large ecosystems |
Recommendations:
- Monorepo – small teams, tight coupling, shared releases
- Multirepo – independent services, different release cycles, large communities
Issue Templates (Wiki Format)
Clear issue templates improve collaboration and contributor experience.
Bug Report
== Description ==
A clear and concise description of the bug.

== Steps to Reproduce ==
# Step 1
# Step 2
# Step 3

== Expected Behavior ==
What you expected to happen.

== Actual Behavior ==
What actually happened.

== Environment ==
* OS:
* Python / Node.js version:
* Browser (if applicable):

== Additional Context ==
Logs, screenshots, or links.
Feature Request
== Summary ==
Short description of the requested feature.

== Motivation ==
Why is this feature needed?

== Proposed Solution ==
Describe the preferred solution.

== Alternatives ==
Other approaches considered.

== Additional Context ==
Links, mockups, or references.
Documentation Issue
== Documentation Section ==
Which page or file needs improvement?

== Problem ==
What is unclear, missing, or incorrect?

== Suggested Improvement ==
Proposed text or structure.
Open Source Project Best Practices
These practices help attract and retain contributors.
Governance and Transparency
- Define maintainers and roles
- Use public roadmaps
- Make decisions in issues and PRs
Contribution Experience
- Clear README and CONTRIBUTING.md
- Friendly issue templates
- Label beginner issues (e.g. good first issue)
Licensing and Legal
- Always include a LICENSE file
- Ensure dependencies are license-compatible
- Avoid committing secrets or credentials
Community Standards
- Add a Code of Conduct (e.g. Contributor Covenant)
- Enforce respectful communication
- Moderate discussions consistently
Release Management
- Use semantic versioning
- Maintain a changelog
- Tag releases
- Automate releases where possible
Security
- Provide a SECURITY.md
- Define responsible disclosure process
- Keep dependencies up to date
Open Source Checklist
- README.md
- CONTRIBUTING.md
- CHANGELOG.md
- LICENSE
- CODE_OF_CONDUCT.md
- SECURITY.md
- CI pipelines enabled
- Issue and PR templates
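The file-based items in this checklist are easy to verify automatically. Here is a minimal sketch that reports which of the expected files are missing from a repository root. The helper `missing_files` is hypothetical, and the CI and template items are not checked.

```python
from pathlib import Path

# File names taken from the checklist above.
REQUIRED_FILES = [
    "README.md",
    "CONTRIBUTING.md",
    "CHANGELOG.md",
    "LICENSE",
    "CODE_OF_CONDUCT.md",
    "SECURITY.md",
]


def missing_files(repo_root, required=REQUIRED_FILES):
    """Return the required file names that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All checklist files are present.")
```

Run from the repository root, the script could serve as a lightweight CI step that fails review when a file is removed by accident.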
Open Source Collaboration and Release Management
This section defines contribution workflows, security policies, community standards, and automated releases.
Pull Request Templates
Pull Request templates help reviewers and contributors align on expectations.
Pull Request Template (General)
## Description
Brief summary of the changes introduced by this PR.

## Related Issue
Closes #<issue-number>

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Refactoring
- [ ] CI / tooling

## How Has This Been Tested?
Describe the tests that you ran.

## Checklist
- [ ] Code follows project style guidelines
- [ ] Tests added or updated
- [ ] Documentation updated (if applicable)
- [ ] CI pipeline passes
Best practices:
- Require PR templates for all contributions
- Enforce reviews via branch protection
- Keep PRs small and focused
Security Policy (SECURITY.md)
Open source projects should clearly define how to report vulnerabilities.
Example SECURITY.md
# Security Policy

## Supported Versions
Only the latest major version is actively supported with security updates.

## Reporting a Vulnerability
If you discover a security vulnerability, please do NOT open a public issue.

Instead, report it by emailing: security@project-domain.example

Please include:
- A description of the vulnerability
- Steps to reproduce
- Potential impact
- Suggested remediation (if available)

We aim to respond within 72 hours.
Best practices:
- Never discuss vulnerabilities publicly before a fix
- Acknowledge reporters responsibly
- Publish security advisories after resolution
Code of Conduct (CODE_OF_CONDUCT.md)
A Code of Conduct creates a safe and welcoming community.
Example CODE_OF_CONDUCT.md
# Code of Conduct

## Our Pledge
We are committed to providing a respectful and inclusive environment for everyone.

## Expected Behavior
- Be respectful and considerate
- Use welcoming and inclusive language
- Accept constructive criticism
- Focus on what is best for the community

## Unacceptable Behavior
- Harassment or discrimination
- Trolling or personal attacks
- Publishing private information

## Enforcement
Project maintainers are responsible for enforcing this code of conduct.

## Reporting
Report incidents to: conduct@project-domain.example
Recommendation:
- Use the Contributor Covenant as a base
- Enforce consistently and transparently
Release Automation (semantic-release)
Automated releases reduce human error and ensure consistency.
What semantic-release Does
- Determines next version from commit messages
- Generates changelog entries
- Creates Git tags and releases
- Publishes artifacts automatically
Commit Requirements
semantic-release requires Conventional Commits:
- feat: introduces a new feature (MINOR)
- fix: bug fix (PATCH)
- feat!: or a BREAKING CHANGE: footer marks a breaking change (MAJOR)
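The mapping from commit messages to a release type can be sketched in a few lines. The code below follows the Conventional Commits rules listed above rather than the behavior of any particular semantic-release preset, whose real analyzer has more rules and is configurable. The names `release_type` and `next_release` are assumptions.

```python
import re

# type, optional "(scope)", optional "!" marker, then ": "
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^()]*\))?(?P<breaking>!)?: ")
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def release_type(message):
    """Return 'major', 'minor', 'patch', or None for one commit message."""
    header, _, rest = message.partition("\n")
    match = HEADER.match(header)
    if match is None:
        return None  # not a Conventional Commit: no release
    has_footer = re.search(r"^BREAKING CHANGE: ", rest, re.MULTILINE)
    if match.group("breaking") or has_footer:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") == "fix":
        return "patch"
    return None  # docs, test, chore, ... do not trigger a release here


def next_release(messages):
    """Return the highest release type required by any commit, or None."""
    return max((release_type(m) for m in messages), key=ORDER.get, default=None)


print(next_release(["docs: fix typo", "fix: handle None", "feat: add export"]))
```

The highest-ranked change wins, so a single breaking commit among many fixes still results in a major release.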
Example semantic-release Configuration
{
  "branches": ["main"],
  "plugins": [
    "@semantic-release/commit-analyzer",
    "@semantic-release/release-notes-generator",
    "@semantic-release/changelog",
    "@semantic-release/github"
  ]
}
GitHub Actions: Automated Release
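name: Release

on:
  push:
    branches:
      - main

jobs:
  release:
    runs-on: ubuntu-latest
    permissions:
      contents: write        # create tags and GitHub releases
      issues: write          # comment on released issues
      pull-requests: write   # comment on released pull requests
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0     # full history, so earlier release tags are visible
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      # Assumes semantic-release and its plugins are listed in devDependencies
      - run: npm ci
      - run: npx semantic-release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}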
Python + semantic-release Notes
- semantic-release manages versions and tags
- Python packages should either:
  - Derive the version from Git tags at build time (e.g. setuptools_scm)
  - Or have the release pipeline inject the version during the build
Open Source Release Best Practices
- Use automated releases
- Never manually edit versions
- Always release from main branch
- Keep CHANGELOG.md generated automatically
- Tag every release
Final Open Source Readiness Checklist
- README.md
- CONTRIBUTING.md
- CHANGELOG.md
- LICENSE
- CODE_OF_CONDUCT.md
- SECURITY.md
- Issue templates
- Pull Request templates
- CI pipelines per service
- Automated releases enabled
Advanced Open Source Project Setup
This section completes the open source framework with publishing automation, governance, labeling standards, and repository structure.
Automated Package Publishing
Automated publishing ensures consistent, repeatable releases.
PyPI Publishing (Python)
Best practice:
- Publish only from tagged releases
- Use CI for trusted publishing
GitHub Actions: Publish to PyPI
name: Publish Python Package

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write   # required for PyPI Trusted Publishing (OIDC)
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install build
      - run: python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
Requirements:
- pyproject.toml configured
- Trusted Publishing enabled in PyPI
npm Publishing (JavaScript)
Best practice:
- Use semantic-release
- Publish only from main branch
GitHub Actions: Publish to npm
name: Publish npm Package

on:
  push:
    branches:
      - main

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
          registry-url: https://registry.npmjs.org
      - run: npm ci
      # Publishes to npm only if @semantic-release/npm is among the configured plugins
      - run: npx semantic-release
        env:
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GitHub Labels Taxonomy
A consistent label system improves triage and contributor onboarding.
Type Labels
- bug
- enhancement
- documentation
- refactor
- security
- question
Priority Labels
- priority: critical
- priority: high
- priority: medium
- priority: low
Status Labels
- status: triage
- status: blocked
- status: in progress
- status: ready for review
Scope / Stack Labels
- python
- javascript
- frontend
- backend
- api
- ci
Community Labels
- good first issue
- help wanted
- breaking change
Maintainers and Governance Model
Clear governance improves trust and sustainability.
Roles
- Maintainers
  - Own project direction
  - Review and merge PRs
  - Manage releases
- Contributors
  - Submit issues and PRs
  - Improve code and documentation
Decision Making
- Decisions are made publicly in issues or PRs
- Maintainers aim for consensus
- Maintainer vote is final when consensus cannot be reached
Becoming a Maintainer
- Consistent high-quality contributions
- Community engagement
- Invitation by existing maintainers
Governance File
Recommended file:
- GOVERNANCE.md
Complete Open Source Starter Repository Structure
Recommended structure for a Python + JavaScript open source project:
project-root/
├── backend/
│ ├── app/
│ │ ├── __init__.py
│ │ ├── main.py
│ │ ├── api/
│ │ └── services/
│ ├── tests/
│ ├── pyproject.toml
│ └── README.md
│
├── frontend/
│ ├── src/
│ │ ├── components/
│ │ ├── pages/
│ │ └── services/
│ ├── tests/
│ ├── package.json
│ └── README.md
│
├── docs/
│ ├── api/
│ ├── guides/
│ └── README.md
│
├── .github/
│ ├── workflows/
│ │ ├── backend-ci.yml
│ │ ├── frontend-ci.yml
│ │ └── release.yml
│ ├── ISSUE_TEMPLATE/
│ └── PULL_REQUEST_TEMPLATE.md
│
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── GOVERNANCE.md
├── LICENSE
├── README.md
├── SECURITY.md
└── .releaserc.json
Open Source Maturity Checklist
- Automated CI per service
- Automated releases
- PyPI and npm publishing
- Clear contribution workflow
- Governance defined
- Labels and templates configured
- Security policy documented
- Code of conduct enforced
Final Notes
Well-maintained open source projects prioritize:
- Automation over manual work
- Transparency over private decisions
- Documentation over tribal knowledge
- Community over individual ownership