
[StyleGAN2-ADA hands-on] Training StyleGAN2-ADA on the AFHQ dataset — Part 1


Preparing the dataset

AFHQ is an animal face dataset (roughly 5,000 images each of dogs, cats, and wild animals).

github.com/clovaai/stargan-v2/blob/master/README.md#animal-faces-hq-dataset-afhq

 


Download the dataset from the link above.
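The stargan-v2 repository also provides a download script; a command along these lines should fetch AFHQ (the exact target name should be checked against the README):

bash download.sh afhq-dataset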

To train stylegan2-ada, the dataset first has to be converted into TFRecords format.

github.com/NVlabs/stylegan2-ada

 


Clone the code from the repository above and run the following commands.

# python dataset_tool.py create_from_images <output_path> <per-class_image_path>

python dataset_tool.py create_from_images ~/datasets/afhqcat ~/downloads/afhq/train/cat
python dataset_tool.py create_from_images ~/datasets/afhqdog ~/downloads/afhq/train/dog
python dataset_tool.py create_from_images ~/datasets/afhqwild ~/downloads/afhq/train/wild 
python dataset_tool.py display ~/datasets/afhqcat

Running these commands starts the conversion into TFRecords format.

The .tfrecords files are then created in the target directories.
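As a quick sanity check (a minimal sketch, not part of the repo, assuming TensorFlow 1.x as required by stylegan2-ada), the number of records in the generated files can be counted like this:

import glob
import os
import tensorflow as tf

# Count the records in each .tfrecords file produced by dataset_tool.py.
for path in sorted(glob.glob(os.path.expanduser('~/datasets/afhqcat/*.tfrecords'))):
    num_records = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print(path, num_records, 'records')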

 

The training script is argparse-based, so a small workaround is needed to run it from JupyterLab.

The settings below match the conditions used for AFHQ in the paper; the args inside the main function need to be modified accordingly.

import easydict
import sys

# Arguments matching the paper512 AFHQ configuration; replace the
# argparse-generated args inside main() with this dict.
args = easydict.EasyDict({
    "outdir": './output',
    "data": '../data/afhqdog',
    "mirror": True,
    "cfg": "paper512",
    "aug": 'ada',
})

#----------------------------------------------------------------------------

# Clear the notebook's command-line arguments so argparse inside main() does not choke on them.
sys.argv = ['']
del sys

main()

#----------------------------------------------------------------------------

The sys.argv trick above comes from this Spyder issue (Error when executing argparse in IPython console):

github.com/spyder-ide/spyder/issues/3883#issuecomment-269131039
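For reference, the same configuration can also be launched from a terminal instead of Jupyter; a command roughly like the following should be equivalent (see the stylegan2-ada README for the full option list):

python train.py --outdir=./output --gpus=1 --data=../data/afhqdog --mirror=1 --cfg=paper512 --aug=ada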

Handling memory errors

stackoverflow.com/questions/57507832/unable-to-allocate-array-with-shape-and-data-type

 


The np.zeros([1<<30, 0]) call is what triggers the memory error.

Running the command below in a terminal will most likely print 0:

$ cat /proc/sys/vm/overcommit_memory
0

Running the following as root changes the value to 1:

$ echo 1 > /proc/sys/vm/overcommit_memory

After that, the np.zeros([1<<30, 0]) call works fine.

The reason this works, according to the Stack Overflow answer, is that the command enables memory overcommit on the system:

This will enable "always overcommit" mode, and you'll find that indeed the system will allow you to make the allocation no matter how large it is (within 64-bit memory addressing at least).
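Before a long run it can be convenient to check this setting from Python as well; a small sketch (not part of the repo):

# Warn if the kernel does not allow memory overcommit.
def check_overcommit():
    with open('/proc/sys/vm/overcommit_memory') as f:
        mode = int(f.read().strip())
    if mode != 1:
        print('vm.overcommit_memory =', mode,
              '- the np.zeros([1 << 30, 0]) allocation may fail;'
              ' run "echo 1 > /proc/sys/vm/overcommit_memory" as root.')

check_overcommit()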

When training is started, the following text is printed:

Training options:
{
  "G_args": {
    "func_name": "training.networks.G_main",
    "fmap_base": 16384,
    "fmap_max": 512,
    "mapping_layers": 8,
    "num_fp16_res": 4,
    "conv_clamp": 256
  },
  "D_args": {
    "func_name": "training.networks.D_main",
    "mbstd_group_size": 8,
    "fmap_base": 16384,
    "fmap_max": 512,
    "num_fp16_res": 4,
    "conv_clamp": 256
  },
  "G_opt_args": {
    "beta1": 0.0,
    "beta2": 0.99,
    "learning_rate": 0.0025
  },
  "D_opt_args": {
    "beta1": 0.0,
    "beta2": 0.99,
    "learning_rate": 0.0025
  },
  "loss_args": {
    "func_name": "training.loss.stylegan2",
    "r1_gamma": 0.5
  },
  "augment_args": {
    "class_name": "training.augment.AdaptiveAugment",
    "tune_heuristic": "rt",
    "tune_target": 0.6,
    "apply_func": "training.augment.augment_pipeline",
    "apply_args": {
      "xflip": 1,
      "rotate90": 1,
      "xint": 1,
      "scale": 1,
      "rotate": 1,
      "aniso": 1,
      "xfrac": 1,
      "brightness": 1,
      "contrast": 1,
      "lumaflip": 1,
      "hue": 1,
      "saturation": 1
    }
  },
  "num_gpus": 1,
  "image_snapshot_ticks": 50,
  "network_snapshot_ticks": 50,
  "train_dataset_args": {
    "path": "../data/afhqdog",
    "max_label_size": 0,
    "resolution": 512,
    "mirror_augment": true
  },
  "metric_arg_list": [
    {
      "name": "fid50k_full",
      "class_name": "metrics.frechet_inception_distance.FID",
      "max_reals": null,
      "num_fakes": 50000,
      "minibatch_per_gpu": 8,
      "force_dataset_args": {
        "shuffle": false,
        "max_images": null,
        "repeat": false,
        "mirror_augment": false
      }
    }
  ],
  "metric_dataset_args": {
    "path": "../data/afhqdog",
    "max_label_size": 0,
    "resolution": 512,
    "mirror_augment": true
  },
  "total_kimg": 25000,
  "minibatch_size": 64,
  "minibatch_gpu": 8,
  "G_smoothing_kimg": 20,
  "G_smoothing_rampup": null,
  "run_dir": "./output/00002-afhqdog-mirror-paper512"
}

Output directory:  ./output/00002-afhqdog-mirror-paper512
Training data:     ../data/afhqdog
Training length:   25000 kimg
Resolution:        512
Number of GPUs:    1

Creating output directory...
Loading training set...
Image shape: [3, 512, 512]
Label shape: [0]

Constructing networks...

G                             Params    OutputShape         WeightShape     
---                           ---       ---                 ---             
latents_in                    -         (?, 512)            -               
labels_in                     -         (?, 0)              -               
G_mapping/Normalize           -         (?, 512)            -               
G_mapping/Dense0              262656    (?, 512)            (512, 512)      
G_mapping/Dense1              262656    (?, 512)            (512, 512)      
G_mapping/Dense2              262656    (?, 512)            (512, 512)      
G_mapping/Dense3              262656    (?, 512)            (512, 512)      
G_mapping/Dense4              262656    (?, 512)            (512, 512)      
G_mapping/Dense5              262656    (?, 512)            (512, 512)      
G_mapping/Dense6              262656    (?, 512)            (512, 512)      
G_mapping/Dense7              262656    (?, 512)            (512, 512)      
G_mapping/Broadcast           -         (?, 16, 512)        -               
dlatent_avg                   -         (512,)              -               
Truncation/Lerp               -         (?, 16, 512)        -               
G_synthesis/4x4/Const         8192      (?, 512, 4, 4)      (1, 512, 4, 4)  
G_synthesis/4x4/Conv          2622465   (?, 512, 4, 4)      (3, 3, 512, 512)
G_synthesis/4x4/ToRGB         264195    (?, 3, 4, 4)        (1, 1, 512, 3)  
G_synthesis/8x8/Conv0_up      2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Conv1         2622465   (?, 512, 8, 8)      (3, 3, 512, 512)
G_synthesis/8x8/Upsample      -         (?, 3, 8, 8)        -               
G_synthesis/8x8/ToRGB         264195    (?, 3, 8, 8)        (1, 1, 512, 3)  
G_synthesis/16x16/Conv0_up    2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Conv1       2622465   (?, 512, 16, 16)    (3, 3, 512, 512)
G_synthesis/16x16/Upsample    -         (?, 3, 16, 16)      -               
G_synthesis/16x16/ToRGB       264195    (?, 3, 16, 16)      (1, 1, 512, 3)  
G_synthesis/32x32/Conv0_up    2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Conv1       2622465   (?, 512, 32, 32)    (3, 3, 512, 512)
G_synthesis/32x32/Upsample    -         (?, 3, 32, 32)      -               
G_synthesis/32x32/ToRGB       264195    (?, 3, 32, 32)      (1, 1, 512, 3)  
G_synthesis/64x64/Conv0_up    2622465   (?, 512, 64, 64)    (3, 3, 512, 512)
G_synthesis/64x64/Conv1       2622465   (?, 512, 64, 64)    (3, 3, 512, 512)
G_synthesis/64x64/Upsample    -         (?, 3, 64, 64)      -               
G_synthesis/64x64/ToRGB       264195    (?, 3, 64, 64)      (1, 1, 512, 3)  
G_synthesis/128x128/Conv0_up  1442561   (?, 256, 128, 128)  (3, 3, 512, 256)
G_synthesis/128x128/Conv1     721409    (?, 256, 128, 128)  (3, 3, 256, 256)
G_synthesis/128x128/Upsample  -         (?, 3, 128, 128)    -               
G_synthesis/128x128/ToRGB     132099    (?, 3, 128, 128)    (1, 1, 256, 3)  
G_synthesis/256x256/Conv0_up  426369    (?, 128, 256, 256)  (3, 3, 256, 128)
G_synthesis/256x256/Conv1     213249    (?, 128, 256, 256)  (3, 3, 128, 128)
G_synthesis/256x256/Upsample  -         (?, 3, 256, 256)    -               
G_synthesis/256x256/ToRGB     66051     (?, 3, 256, 256)    (1, 1, 128, 3)  
G_synthesis/512x512/Conv0_up  139457    (?, 64, 512, 512)   (3, 3, 128, 64) 
G_synthesis/512x512/Conv1     69761     (?, 64, 512, 512)   (3, 3, 64, 64)  
G_synthesis/512x512/Upsample  -         (?, 3, 512, 512)    -               
G_synthesis/512x512/ToRGB     33027     (?, 3, 512, 512)    (1, 1, 64, 3)   
---                           ---       ---                 ---             
Total                         30276583                                      


D                    Params    OutputShape         WeightShape     
---                  ---       ---                 ---             
images_in            -         (?, 3, 512, 512)    -               
labels_in            -         (?, 0)              -               
512x512/FromRGB      256       (?, 64, 512, 512)   (1, 1, 3, 64)   
512x512/Conv0        36928     (?, 64, 512, 512)   (3, 3, 64, 64)  
512x512/Conv1_down   73856     (?, 128, 256, 256)  (3, 3, 64, 128) 
512x512/Skip         8192      (?, 128, 256, 256)  (1, 1, 64, 128) 
256x256/Conv0        147584    (?, 128, 256, 256)  (3, 3, 128, 128)
256x256/Conv1_down   295168    (?, 256, 128, 128)  (3, 3, 128, 256)
256x256/Skip         32768     (?, 256, 128, 128)  (1, 1, 128, 256)
128x128/Conv0        590080    (?, 256, 128, 128)  (3, 3, 256, 256)
128x128/Conv1_down   1180160   (?, 512, 64, 64)    (3, 3, 256, 512)
128x128/Skip         131072    (?, 512, 64, 64)    (1, 1, 256, 512)
64x64/Conv0          2359808   (?, 512, 64, 64)    (3, 3, 512, 512)
64x64/Conv1_down     2359808   (?, 512, 32, 32)    (3, 3, 512, 512)
64x64/Skip           262144    (?, 512, 32, 32)    (1, 1, 512, 512)
32x32/Conv0          2359808   (?, 512, 32, 32)    (3, 3, 512, 512)
32x32/Conv1_down     2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
32x32/Skip           262144    (?, 512, 16, 16)    (1, 1, 512, 512)
16x16/Conv0          2359808   (?, 512, 16, 16)    (3, 3, 512, 512)
16x16/Conv1_down     2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
16x16/Skip           262144    (?, 512, 8, 8)      (1, 1, 512, 512)
8x8/Conv0            2359808   (?, 512, 8, 8)      (3, 3, 512, 512)
8x8/Conv1_down       2359808   (?, 512, 4, 4)      (3, 3, 512, 512)
8x8/Skip             262144    (?, 512, 4, 4)      (1, 1, 512, 512)
4x4/MinibatchStddev  -         (?, 513, 4, 4)      -               
4x4/Conv             2364416   (?, 512, 4, 4)      (3, 3, 513, 512)
4x4/Dense0           4194816   (?, 512)            (8192, 512)     
Output               513       (?, 1)              (512, 1)        
---                  ---       ---                 ---             
Total                28982849                                      

Exporting sample images...

The directory specified as outdir also contains a training_options.json file; opening it shows the same training hyperparameters as the log above, neatly formatted.
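The same hyperparameters can be reloaded from that file, for example (a minimal sketch; the run directory name is taken from the log above):

import json

# Reload the hyperparameters that train.py saved for this run.
with open('./output/00002-afhqdog-mirror-paper512/training_options.json') as f:
    training_options = json.load(f)
print(json.dumps(training_options, indent=2))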

Pretrained model files

Pretrained models are available as .pkl files at the link below; starting from one of these should make training converge faster.

nvlabs-fi-cdn.nvidia.com/stylegan2-ada/pretrained/
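For example, transfer learning from one of these pickles should be possible via train.py's --resume option, roughly like the command below (the exact .pkl name should be checked against the directory listing above):

python train.py --outdir=./output --gpus=1 --data=../data/afhqdog --mirror=1 --cfg=paper512 --aug=ada \
    --resume=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada/pretrained/afhqdog.pkl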

 
