Commit 2908a53

Merge pull request #97 from CompVis/scene-images-coco
Added scene image generation for COCO 🌆
2 parents 141eb74 + 6194bd1 commit 2908a53

221 files changed, with 3794 additions and 10 deletions

assets/coco_scene_images_training.svg

Lines changed: 2574 additions & 0 deletions

configs/coco_cond_stage.yaml

Lines changed: 2 additions & 2 deletions
@@ -30,7 +30,7 @@ model:
         codebook_weight: 1.0
 
 data:
-  target: cutlit.DataModuleFromConfig
+  target: main.DataModuleFromConfig
   params:
     batch_size: 12
     train:
@@ -41,7 +41,7 @@ data:
         onehot_segmentation: true
         use_stuffthing: true
     validation:
-      target: taming.data.coco.CocoImagesAndCaptionsTrain
+      target: taming.data.coco.CocoImagesAndCaptionsValidation
       params:
         size: 256
         crop_size: 256
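
Both changed lines are `target:` entries, that is, dotted import paths naming the class to build. As a rough illustration of why the module prefix matters (`cutlit` versus `main`), here is a minimal sketch of resolving such a path with `importlib`. The helper names `resolve_target` and `build_from_config` are hypothetical and not taken from this commit, and the example resolves a standard-library class so that it runs anywhere.

```python
import importlib


def resolve_target(path: str):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.Name', got {path!r}")
    return getattr(importlib.import_module(module_name), attr)


def build_from_config(config: dict):
    """Instantiate config['target'] with config.get('params') as keyword arguments."""
    cls = resolve_target(config["target"])
    return cls(**(config.get("params") or {}))


# Stand-in config: a stdlib class is used here instead of a project class.
example = {
    "target": "fractions.Fraction",
    "params": {"numerator": 3, "denominator": 4},
}
obj = build_from_config(example)
print(obj)
```

Under this sketch, a path whose module cannot be imported raises `ModuleNotFoundError`, so the prefix in `target:` has to match a module that is importable at run time.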
Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
+model:
+  base_learning_rate: 4.5e-06
+  target: taming.models.cond_transformer.Net2NetTransformer
+  params:
+    cond_stage_key: objects_bbox
+    transformer_config:
+      target: taming.modules.transformer.mingpt.GPT
+      params:
+        vocab_size: 8192
+        block_size: 348 # = 256 + 92 = dim(vqgan_latent_space,16x16) + dim(conditional_builder.embedding_dim)
+        n_layer: 40
+        n_head: 16
+        n_embd: 1408
+        embd_pdrop: 0.1
+        resid_pdrop: 0.1
+        attn_pdrop: 0.1
+    first_stage_config:
+      target: taming.models.vqgan.VQModel
+      params:
+        ckpt_path: /path/to/coco_epoch117.ckpt # https://heibox.uni-heidelberg.de/f/78dea9589974474c97c1/
+        embed_dim: 256
+        n_embed: 8192
+        ddconfig:
+          double_z: false
+          z_channels: 256
+          resolution: 256
+          in_channels: 3
+          out_ch: 3
+          ch: 128
+          ch_mult:
+          - 1
+          - 1
+          - 2
+          - 2
+          - 4
+          num_res_blocks: 2
+          attn_resolutions:
+          - 16
+          dropout: 0.0
+        lossconfig:
+          target: taming.modules.losses.DummyLoss
+    cond_stage_config:
+      target: taming.models.dummy_cond_stage.DummyCondStage
+      params:
+        conditional_key: objects_bbox
+
+data:
+  target: main.DataModuleFromConfig
+  params:
+    batch_size: 6
+    num_workers: 12
+    train:
+      target: taming.data.annotated_objects_coco.AnnotatedObjectsCoco
+      params:
+        data_path: data/coco_annotations_100
+        split: train
+        keys: [image, objects_bbox, file_name]
+        no_tokens: 8192
+        target_image_size: 256
+        min_object_area: 0.00001
+        min_objects_per_image: 2
+        max_objects_per_image: 30
+        crop_method: random-1d
+        random_flip: true
+        use_group_parameter: true
+        encode_crop: true
+    validation:
+      target: taming.data.annotated_objects_coco.AnnotatedObjectsCoco
+      params:
+        data_path: data/coco_annotations_100
+        split: validation
+        keys: [image, objects_bbox, file_name]
+        no_tokens: 8192
+        target_image_size: 256
+        min_object_area: 0.00001
+        min_objects_per_image: 2
+        max_objects_per_image: 30
+        crop_method: center
+        random_flip: false
+        use_group_parameter: true
+        encode_crop: true
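
The arithmetic in the `block_size` comment of the added config can be checked by hand. The sketch below assumes that each `ch_mult` entry after the first halves the spatial resolution in the first stage (an assumption about the VQGAN encoder, not something this diff states), and it takes the conditioning length of 92 directly from the comment rather than deriving it.

```python
# Values copied from the config added in this commit.
resolution = 256
ch_mult = [1, 1, 2, 2, 4]
conditioning_length = 92  # taken from the block_size comment, not derived here

# Assumption: one 2x downsampling step per ch_mult entry after the first.
downsample_factor = 2 ** (len(ch_mult) - 1)
latent_side = resolution // downsample_factor
image_tokens = latent_side * latent_side

block_size = image_tokens + conditioning_length
print(latent_side, image_tokens, block_size)  # 16 256 348
```

Under that assumption the latent grid is 16x16, which agrees with both the `16x16` in the comment and the single `attn_resolutions` entry of 16.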

data/coco_annotations_100/annotations/instances_train2017.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

data/coco_annotations_100/annotations/instances_val2017.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

data/coco_annotations_100/annotations/stuff_train2017.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

data/coco_annotations_100/annotations/stuff_val2017.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
