File Structure Overview
Animius uses various files, ranging from configs to TensorFlow model checkpoints. This overview briefly explains how and where each kind of object is stored by default when using the console. These defaults do not apply when Animius is used as a Python library or when directories are provided explicitly. All config files created by Animius are JSON.
When you start Animius for the first time, the console prompts you to choose a directory as the storage space for resources. This directory will contain all of the files except \user-config.json, which stays in the root directory.
Note the distinction between \config.json and \user-config.json: \config.json serves as a template for \user-config.json, which should be created either manually or automatically at first launch. This way, when the user pulls updates from git, the user's settings are not overwritten.
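The template approach described above can be sketched roughly as follows. This is only an illustration, assuming both files sit in the same root directory; the function name `ensure_user_config` is hypothetical and is not part of Animius.

```python
import json
import shutil
from pathlib import Path


def ensure_user_config(root: Path) -> dict:
    """Return the user config, creating it from the template if it is missing.

    Hypothetical helper: assumes config.json and user-config.json both live
    directly under `root`.
    """
    template = root / "config.json"
    user_config = root / "user-config.json"

    # Copy the template only when no user config exists yet, so local
    # settings survive later updates to config.json.
    if not user_config.exists():
        shutil.copyfile(template, user_config)

    with user_config.open(encoding="utf-8") as f:
        return json.load(f)
```

Because the template is copied only once, later changes to config.json (for example after a git pull) never touch the values already stored in user-config.json.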
Waifus (or Waifu-tachi)
A Waifu, as the name suggests, is a set of models that are linked together to create an artificial intelligence agent. While waifus are stored individually as JSON files in the waifus folder, the \waifus\waifus.json file contains information on all waifus for console usage.
Unlike waifus, each model gets its own folder under \models\. A single file, \models\models.json, stores information on all of the models. This also allows models to be stored outside of \models\ (although this is strongly discouraged). As an example, take a single model named myModel. Its folder \models\myModel\ will include a file \models\myModel\myModel.json containing its config. Model checkpoints and graphs will be stored in the same folder as myModel.json by default, while TensorBoard files will be stored in a folder defined by the user in the model config.
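The default locations described above can be expressed as a small path helper. This is a sketch under the assumption that the resources directory is known; `default_model_paths` is a hypothetical name, not an Animius function, and models stored outside the models folder would not follow it.

```python
from pathlib import Path


def default_model_paths(resources: Path, name: str) -> dict:
    """Build the default locations for one model (hypothetical helper)."""
    model_dir = resources / "models" / name
    return {
        # Single index file holding information on all models.
        "index": resources / "models" / "models.json",
        # Checkpoints and graphs are stored in this folder by default.
        "dir": model_dir,
        # Per-model config, named after the model.
        "config": model_dir / f"{name}.json",
    }


paths = default_model_paths(Path("Resources"), "myModel")
print(paths["config"])
```

TensorBoard output is deliberately left out, since its folder is defined by the user in the model config.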
Model configs, located under \model_configs\, are structured much like waifus: individual user-named JSON configs and a main model_configs.json. Each model config is an individual JSON file that is very similar to the config file of a model, differing only in a few model-specific values such as names and epochs.
It is generally best to store data in its raw form, such as audio or words, so that it can always be regenerated after changes in model structure. Nevertheless, it may be more convenient to store parsed data as numpy arrays. The saved files will be found in the \data\ folder, within the individual folders indicated in \data\data.json. Each folder will contain a JSON config and a .npz file generated by numpy.
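As a rough illustration of the layout above, parsed arrays and their JSON config could be written and read back as shown below. The helper name, the array keys `x` and `y`, and the config contents are assumptions made for this sketch; only the file naming follows the file tree example.

```python
import json
from pathlib import Path

import numpy as np


def save_parsed_data(data_dir: Path, name: str, arrays: dict, config: dict) -> Path:
    """Write <name>.json and <name>_np_arrays.npz into data_dir/<name>/.

    Hypothetical helper; Animius may store additional fields.
    """
    folder = data_dir / name
    folder.mkdir(parents=True, exist_ok=True)

    with (folder / f"{name}.json").open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)

    # np.savez stores each keyword argument as a named array in one archive.
    npz_path = folder / f"{name}_np_arrays.npz"
    np.savez(npz_path, **arrays)
    return npz_path


def load_parsed_data(npz_path: Path) -> dict:
    """Read every array in the archive back into a plain dict."""
    with np.load(npz_path) as archive:
        return {key: archive[key] for key in archive.files}
```

A call such as `save_parsed_data(Path("Resources/data"), "myData", {"x": x, "y": y}, {"name": "myData"})` would produce the myData folder shown in the file tree example.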
When storing data, there is also the option of saving separate copies of the model config and word embedding. This prevents later changes that may be incompatible with the parsed data, such as changes to sequence lengths and token indexes. These individual copies will be stored in folders within the directory of the JSON config and the .npz file.
Embeddings are stored under the \embeddings\ folder with an \embeddings\embeddings.json config. Each embedding has an individual folder containing a .npy numpy ndarray file and two .pkl pickle files. No individual JSON file is stored for embeddings.
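Reading one embedding folder back might look like the sketch below. The file names follow the file tree example; the helper name and the assumption that the two pickle files hold a word list and a word-to-index mapping are guesses based on those file names, not confirmed Animius behaviour.

```python
import pickle
from pathlib import Path

import numpy as np


def load_embedding(embeddings_dir: Path, name: str):
    """Load the .npy array and the two pickle files of one embedding.

    Hypothetical helper. Only unpickle files from a source you trust,
    since pickle can execute arbitrary code on load.
    """
    folder = embeddings_dir / name

    # The ndarray itself.
    vectors = np.load(folder / f"{name}.npy")

    # Assumed contents: the vocabulary and its lookup table.
    with (folder / f"{name}_words.pkl").open("rb") as f:
        words = pickle.load(f)
    with (folder / f"{name}_words_to_index.pkl").open("rb") as f:
        words_to_index = pickle.load(f)

    return vectors, words, words_to_index
```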
File tree example
Resources
├─── waifus
│    ├─── waifus.json
│    └─── yukino_waifu_config.json
├─── models
│    ├─── models.json
│    └─── myModel
│         ├─── checkpoint
│         ├─── myModel.json
│         ├─── myModel_graph.pb
│         ├─── myModel-0.data-00000
│         └─── myModel-0.meta
├─── model_configs
│    ├─── model_configs.json
│    └─── myModelConfig.json
├─── data
│    ├─── data.json
│    └─── myData
│         ├─── myData.json
│         └─── myData_np_arrays.npz
└─── embeddings
     ├─── embeddings.json
     └─── myEmbedding
          ├─── myEmbedding.npy
          ├─── myEmbedding_words.pkl
          └─── myEmbedding_words_to_index.pkl
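To check an existing resources directory against this tree, one could look for the five index files it names. The following is a minimal sketch; `missing_index_files` is a hypothetical helper and does not inspect the contents of the files.

```python
from pathlib import Path

# Index files named in this overview, relative to the resources directory.
INDEX_FILES = (
    "waifus/waifus.json",
    "models/models.json",
    "model_configs/model_configs.json",
    "data/data.json",
    "embeddings/embeddings.json",
)


def missing_index_files(resources: Path) -> list:
    """Return the index files from the tree that do not exist yet."""
    return [rel for rel in INDEX_FILES if not (resources / rel).is_file()]
```

An empty list means every index file is present; it says nothing about whether the individual model, data, or embedding folders are complete.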