Add ViT models, HSI Hang2020, inference API, and taxonomic level classification #13

Open
Ritesh313 wants to merge 5 commits into GatorSense:main from Ritesh313:main

@Ritesh313

Summary

This PR adds two major feature sets to the NeonTreeClassification pipeline: expanded model architectures and a complete inference API with taxonomic level support.


Commit 1 — New model architectures & DeepForest compatibility groundwork

Breaking changes:

  • Default RGB image size changed from 128×128 → 224×224
  • Default RGB normalization changed from 0_1 → imagenet
    • For backward compatibility, pass rgb_size=(128, 128) and rgb_norm_method='0_1' explicitly
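For anyone checking the backward-compatibility path, here is a minimal sketch of what the two normalization modes are assumed to do. It is not the repository's normalize_rgb: the function name normalize_pixel is hypothetical, and the reading of '0_1' as "scale to [0, 1]" and 'imagenet' as "scale, then standardize with the usual ImageNet channel statistics" is an assumption based on the option names.

```python
# Standard ImageNet channel statistics in RGB order (the values torchvision documents).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, method="imagenet"):
    """Hypothetical helper: normalize one 8-bit RGB pixel.

    '0_1'      -> scale each channel to [0, 1]
    'imagenet' -> scale to [0, 1], then subtract the mean and divide by the std
    """
    scaled = [channel / 255.0 for channel in rgb]
    if method == "0_1":
        return scaled
    if method == "imagenet":
        return [
            (value - mean) / std
            for value, mean, std in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)
        ]
    raise ValueError(f"Unknown normalization method: {method!r}")


legacy = normalize_pixel((255, 255, 255), method="0_1")   # [1.0, 1.0, 1.0]
current = normalize_pixel((255, 255, 255))                # values above 1.0
```

The practical consequence is that a checkpoint trained under one mode will see differently scaled inputs under the other, which is why the old settings have to be passed explicitly when reusing older models.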

New features:

  • Vision Transformer (ViT) support: vit_b_16, vit_b_32, vit_l_16, vit_l_32
  • Hang2020 dual-pathway attention architecture for HSI classification
  • model_variant CLI parameter for architecture selection
  • Preliminary DeepForest CropModel compatibility methods (normalize(), label_dict persistence, set_label_dict(), get_label_dict()) — WIP
  • Multi-output training with auxiliary losses (Hang2020)
  • HuggingFace upload script (experimental)
  • Better SLURM experiment naming to prevent array job collisions
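The SLURM naming item can be illustrated with a rough sketch. The helper build_experiment_name is hypothetical and is not taken from train.py; the only things assumed to be real are the environment variables SLURM_ARRAY_JOB_ID, SLURM_ARRAY_TASK_ID, and SLURM_JOB_ID that SLURM sets for jobs.

```python
import os
from datetime import datetime
from typing import Mapping, Optional


def build_experiment_name(base: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: derive a run name that is unique per array task."""
    env = os.environ if env is None else env

    array_job = env.get("SLURM_ARRAY_JOB_ID")
    array_task = env.get("SLURM_ARRAY_TASK_ID")
    if array_job and array_task:
        # Every task in an array shares the array job ID, so the task ID
        # is what keeps two tasks from writing to the same directory.
        return f"{base}_job{array_job}_task{array_task}"

    job_id = env.get("SLURM_JOB_ID")
    if job_id:
        return f"{base}_job{job_id}"

    # Outside SLURM, fall back to a timestamp.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{base}_{stamp}"


name = build_experiment_name(
    "vit_b_16", {"SLURM_ARRAY_JOB_ID": "1234", "SLURM_ARRAY_TASK_ID": "7"}
)
```

Passing the environment as an argument keeps the function testable without a scheduler.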

Commit 2 — Inference API & taxonomic level classification

New features:

  • Complete neon_tree_classification.inference module with TreeClassifier class
    • TreeClassifier.from_checkpoint() for easy model loading
    • predict() with batch support and image preprocessing pipeline
  • Species-level (167 classes) and genus-level (60 classes) classification
  • taxonomic_level parameter in DataModule ('species' or 'genus')
  • WeightedRandomSampler for class balancing
  • Label mapping JSON files for both species and genus
  • External test set with species overlap filtering
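As a rough illustration of the class-balancing item, the sketch below computes per-sample inverse-frequency weights. The helper name inverse_frequency_weights and the inverse-frequency scheme are assumptions; the DataModule may weight samples differently.

```python
from collections import Counter
from typing import List, Sequence


def inverse_frequency_weights(labels: Sequence[int]) -> List[float]:
    """Hypothetical helper: weight each sample by 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


labels = [0, 0, 0, 1]  # class 0 is three times as common as class 1
weights = inverse_frequency_weights(labels)

# With PyTorch installed, weights like these can be passed to the sampler:
#   from torch.utils.data import WeightedRandomSampler
#   sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
```

With these weights, each class contributes the same total sampling mass, so rare species or genera are drawn about as often as common ones.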

Documentation:

  • New taxonomic_levels.md — comprehensive guide to taxonomic level training and inference
  • Updated training.md and train.py

No breaking changes in this commit.


Testing

  • test_inference.py covers TreeClassifier loading and prediction
  • create_label_mappings.py for label validation

Major addition:
- Complete inference API for loading models and making predictions
- Support for species-level (167 classes) and genus-level (60 classes) classification
- TreeClassifier class with from_checkpoint() and predict() methods
- Label mapping system with JSON metadata files
- Image preprocessing pipeline for various input formats
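To make the prediction step concrete, here is a small sketch of turning one row of logits into labeled top-k predictions. It does not use or reproduce the TreeClassifier API: the function names are hypothetical, and only the idea of an idx_to_label mapping comes from this PR.

```python
import math
from typing import Dict, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a single row of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_predictions(
    logits: Sequence[float], idx_to_label: Dict[int, str], k: int = 3
) -> List[Tuple[str, float]]:
    """Hypothetical helper: return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(idx_to_label[i], probs[i]) for i in ranked[:k]]


# Illustrative genus-level mapping; the real one lives in the label mapping JSON.
idx_to_label = {0: "Acer", 1: "Pseudotsuga", 2: "Quercus"}
predictions = top_k_predictions([0.1, 2.5, 1.0], idx_to_label, k=2)
```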

Core enhancements:
- DataModule now supports taxonomic_level parameter ('species' or 'genus')
- Genus extraction via species_name.split()[0] for 60-class classification
- WeightedRandomSampler support for class balancing
- External test set with species overlap filtering
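The genus extraction and overlap filtering described above can be sketched as follows. Only species_name.split()[0] comes from this commit; the function names, the alphabetical index order, and the reading of "overlap filtering" as "keep external samples whose species also appear in training" are assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def genus_of(species_name: str) -> str:
    """The genus is the first token of a binomial name."""
    return species_name.split()[0]


def build_genus_mapping(
    species_names: Iterable[str],
) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Hypothetical helper: map genera to indices and species to genera."""
    species_to_genus = {name: genus_of(name) for name in species_names}
    genera = sorted(set(species_to_genus.values()))  # assumed: alphabetical order
    genus_to_idx = {genus: index for index, genus in enumerate(genera)}
    return genus_to_idx, species_to_genus


def filter_to_overlap(
    samples: List[Tuple[str, str]], train_species: Set[str]
) -> List[Tuple[str, str]]:
    """Hypothetical helper: keep (species, path) pairs whose species was seen in training."""
    return [sample for sample in samples if sample[0] in train_species]


species = ["Pseudotsuga menziesii", "Quercus rubra", "Quercus alba", "Acer rubrum"]
genus_to_idx, species_to_genus = build_genus_mapping(species)

external = [("Quercus rubra", "a.h5"), ("Pinus taeda", "b.h5")]
kept = filter_to_overlap(external, set(species))
```

Splitting on whitespace assumes every label is a clean binomial name; hybrid markers or abbreviated entries would need separate handling, which is presumably what the label inspection script is meant to catch.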

Documentation:
- Comprehensive docs/taxonomic_levels.md guide (314 lines)
- Label inspection script for validation
- Test scripts for inference verification
- Examples of progressive training (genus → species)

Files added:
- neon_tree_classification/inference/ (complete module)
- docs/taxonomic_levels.md
- scripts/create_label_mappings.py
- scripts/test_inference.py
- processing/misc/inspect_labels.py

Modified:
- neon_tree_classification/core/datamodule.py (+163 lines)
- neon_tree_classification/core/dataset.py (+77 lines)
- examples/train.py (+21 lines)
- docs/training.md (+18 lines)

This enables:
1. Quick model deployment with TreeClassifier.from_checkpoint()
2. Flexible training at species or genus level
3. Production-ready inference with batch prediction
4. Label mapping files for HuggingFace upload

Breaking changes: None (backward compatible)
…ation

BREAKING CHANGES:
- Default RGB image size changed from 128x128 to 224x224
- Default RGB normalization changed from 0_1 to imagenet
  For backward compatibility, explicitly pass rgb_size=(128, 128) and rgb_norm_method='0_1'

New Features:
- Add Vision Transformer (ViT) support: vit_b_16, vit_b_32, vit_l_16, vit_l_32
- Implement Hang2020 dual-pathway attention architecture for HSI classification
- Add model_variant parameter to training script for architecture selection
- Add preliminary DeepForest CropModel compatibility methods (WIP):
  * normalize() method for transforms
  * label_dict persistence in checkpoints
  * set_label_dict() and get_label_dict() helpers
- Add HuggingFace upload script (experimental, needs further testing)
- Add multi-output training support with auxiliary losses (Hang2020)
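A minimal sketch of how auxiliary losses are commonly combined in multi-output training. It is not the code in lightning_modules.py: the function names and the aux_weight value of 0.3 are hypothetical, and it is assumed that every output head is scored against the same target with cross-entropy.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy for one sample: log-sum-exp of the logits minus the target logit."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(value - peak) for value in logits))
    return log_sum_exp - logits[target]


def multi_output_loss(
    main_logits: Sequence[float],
    aux_logits_list: Sequence[Sequence[float]],
    target: int,
    aux_weight: float = 0.3,  # hypothetical default, not taken from the PR
) -> float:
    """Main loss plus a down-weighted sum of the auxiliary head losses."""
    main = cross_entropy(main_logits, target)
    aux = sum(cross_entropy(aux_logits, target) for aux_logits in aux_logits_list)
    return main + aux_weight * aux


loss = multi_output_loss(
    main_logits=[2.0, 0.5],
    aux_logits_list=[[1.0, 0.2], [0.3, 0.1]],
    target=0,
)
```

At inference time only the main output would normally be used; the auxiliary heads exist to give the intermediate pathways a direct training signal.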

Improvements:
- Better experiment naming to prevent collisions in SLURM array jobs
- Enhanced test logging with detailed statistics
- Add rgb_size and rgb_norm_method CLI arguments for flexibility
- Update README with project roadmap

Note: Full DeepForest CropModel integration and HuggingFace loading
are still in progress and may require additional work.

Files changed: 10 files
- Added: scripts/upload_to_huggingface.py, sample_plots/test_PSMEM_douglas_fir.png
- Modified: train.py, rgb_models.py, hsi_models.py, lightning_modules.py,
            dataset.py, datamodule.py, README.md, visualization.ipynb

Copilot AI left a comment

Pull request overview

This PR expands NeonTreeClassification with additional model architectures (ViT for RGB and Hang2020 for HSI), adds an inference API (TreeClassifier) with taxonomic-level (species/genus) support, and updates training/docs/scripts to support the new workflows.

Changes:

  • Added ViT RGB model variants and Hang2020 HSI architecture with multi-output (aux loss) training support.
  • Introduced neon_tree_classification.inference module (preprocessing, label mappings, registry, predictor API) plus scripts for label mapping creation and HF uploads.
  • Added genus-level training support in the DataModule (label mapping + optional balanced sampling) and updated docs/examples accordingly.

Reviewed changes

Copilot reviewed 21 out of 23 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| scripts/upload_to_huggingface.py | New utility to convert/upload checkpoints to HF; writes config/model card and safetensors. |
| scripts/test_inference.py | New manual inference “test” script for running predictions from HDF5 samples. |
| scripts/create_label_mappings.py | New script to generate species/genus label mapping JSONs for inference. |
| sample_plots/test_PSMEM_douglas_fir.png | Adds an example plot asset. |
| processing/misc/inspect_labels.py | Adds a label/genus inspection script to validate genus extraction assumptions. |
| notebooks/visualization.ipynb | Updates notebook outputs/cell execution metadata and sample CSV path. |
| neon_tree_classification/models/rgb_models.py | Adds ViT-based RGB classifier and extends RGB model factory. |
| neon_tree_classification/models/lightning_modules.py | Adds DeepForest compatibility helpers; adds multi-output loss support for HSI classifier. |
| neon_tree_classification/models/hsi_models.py | Implements Hang2020 dual-pathway attention architecture and adds it to the factory. |
| neon_tree_classification/inference/utils.py | Adds utilities for label loading, prediction formatting, checkpoint metadata extraction, etc. |
| neon_tree_classification/inference/preprocessing.py | Adds image loading/resizing/normalization/tensor preparation pipeline for inference. |
| neon_tree_classification/inference/predictor.py | Adds TreeClassifier high-level inference API. |
| neon_tree_classification/inference/model_registry.py | Adds a local registry for known pretrained model metadata and label mapping resolution. |
| neon_tree_classification/inference/label_mappings/species_labels.json | Adds committed species label mapping used by inference. |
| neon_tree_classification/inference/label_mappings/genus_labels.json | Adds committed genus label mapping used by inference. |
| neon_tree_classification/inference/__init__.py | Exposes inference API/public functions via package imports. |
| neon_tree_classification/core/dataset.py | Updates default RGB preprocessing (224 + ImageNet) and adds genus mapping support in label validation/lookup. |
| neon_tree_classification/core/datamodule.py | Adds taxonomic_level and optional WeightedRandomSampler balanced sampling; genus label mapping creation. |
| examples/train.py | Adds CLI args for model variant, taxonomic level, RGB size/norm; passes idx_to_label for checkpoint metadata. |
| docs/training.md | Updates baseline result table and notes for new configs. |
| docs/taxonomic_levels.md | New guide for genus/species training and filtering guidance. |
| README.md | Updates project status lines and notes. |
| .gitignore | Adds an extra ignored markdown file entry. |

- Remove sys.path manipulation from predictor.py (use package imports directly)
- Remove unused OrderedDict import in create_label_mappings.py
- Update preprocessing defaults to 224x224 and imagenet normalization
- Update preprocess_image_batch and resize_image defaults to match
- Fix normalize_rgb docstring to accurately describe both normalization modes
- Update model_registry input_size to 224x224 and add norm_method field
- Make TreeClassifier norm_method configurable (default: imagenet)
- Fix predictor to use self.norm_method instead of hardcoded '0_1'
- Update from_checkpoint to use (224, 224) and imagenet defaults
- Add rgb_norm_method param to RGBClassifier; normalize() now reflects it
- Validate numeric_to_label_dict in upload_to_huggingface.py
- Fix docs: species_filter is inclusion filter, not exclusion
- Fix docs: add --csv_path to inspect_labels.py example commands
- Fix warning message: clarify species_filter is an inclusion filter
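The numeric_to_label_dict validation mentioned in this list could look roughly like the sketch below. The checks shown (contiguous integer keys starting at 0, non-empty unique string labels) are assumptions about what a sensible validation would cover, not a description of what upload_to_huggingface.py actually does, and the function name is hypothetical.

```python
from typing import Dict, Mapping


def validate_numeric_to_label_dict(mapping: Mapping, num_classes: int) -> Dict[int, str]:
    """Hypothetical validator: return a clean {int: str} mapping or raise ValueError."""
    if not mapping:
        raise ValueError("numeric_to_label_dict is empty")

    try:
        # JSON object keys arrive as strings, so coerce them to int first.
        normalized = {int(key): label for key, label in mapping.items()}
    except (TypeError, ValueError) as error:
        raise ValueError(f"Non-integer class index: {error}") from error

    if sorted(normalized) != list(range(num_classes)):
        raise ValueError(f"Expected contiguous indices 0..{num_classes - 1}")

    labels = list(normalized.values())
    if any(not isinstance(label, str) or not label.strip() for label in labels):
        raise ValueError("Every label must be a non-empty string")
    if len(set(labels)) != len(labels):
        raise ValueError("Duplicate labels found")

    return normalized


checked = validate_numeric_to_label_dict({"0": "Acer", "1": "Quercus"}, num_classes=2)
```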

@Ritesh313 requested a review from Copilot on February 18, 2026, 14:56

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 25 changed files in this pull request and generated no new comments.

