Add GPU support option : see https://github.com/glucauze/sd-webui-faceswaplab/pull/24
# 1.2.0 :
This version changes quite a few things.
@ -18,10 +22,14 @@ Bug fixes :
In terms of the API, it is now possible to create a remote checkpoint and use it in units. See the example in client_api or the tests in the tests directory.
See https://github.com/glucauze/sd-webui-faceswaplab/pull/19
# 1.1.2 :
+ Switch face checkpoint format from pkl to safetensors
See https://github.com/glucauze/sd-webui-faceswaplab/pull/4
By default, these settings are disabled, but you can use the global settings to modify the default behavior. These options are called "Default Upscaled swapper..."
@ -59,13 +72,13 @@ The purpose of this feature is to enhance the quality of the face in the final i
The upscaled inswapper is disabled by default. It can be enabled in the sd options. Understanding the various steps helps explain why results may be unsatisfactory and how to address this issue.
+ **upscaler** : LDSR if None. The LDSR option generally gives the best results but at the expense of a lot of computational time. You should test other models to form an opinion. The 003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN model seems to give good results in a reasonable amount of time. It's not possible to disable upscaling, but it is possible to choose LANCZOS for speed if Codeformer is enabled in the upscaled inswapper. The result is generally satisfactory.
+ **upscaler** : LDSR if None. The LDSR option generally gives the best results but at the expense of a lot of computational time. You should test other models to form an opinion. The [003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN](https://github.com/JingyunLiang/SwinIR/releases/download/v0.0/003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN.pth) model seems to give good results in a reasonable amount of time. It's not possible to disable upscaling, but it is possible to choose LANCZOS for speed if Codeformer is enabled in the upscaled inswapper. The result is generally satisfactory. You can check [here for an upscaler database](https://upscale.wiki/wiki/Model_Database) and [here for some comparison](https://phhofm.github.io/upscale/favorites.html). It is a test and try process.
+ **restorer** : The face restorer to be used if necessary. Codeformer generally gives good results.
+ **sharpening** can provide more natural results, but it may also add artifacts. The same goes for **color correction**. By default, these options are set to False.
+ **improved mask:** The segmentation mask for the upscaled swapper is designed to avoid the square mask and prevent degradation of the non-face parts of the image. It is based on the Codeformer implementation. If "Use improved segmented mask (use pastenet to mask only the face)" and "upscaled inswapper" are checked in the settings, the mask will only cover the face, and will not be squared. However, depending on the image, this might introduce different types of problems such as artifacts on the border of the face.
+ **erosion factor:** it is possible to adjust the mask erosion parameters using the erosion settings. The higher this setting is, the more the mask is reduced.
#### Post-Inpainting :
### Post-Inpainting
This part is applied AFTER face swapping and only on matching faces.
@ -122,7 +135,7 @@ The checkpoint can then be used in the main interface (use refresh button)
## Processing order:
## Processing order
The extension is activated after all other extensions have been processed. During the execution, several steps take place.
@ -157,6 +170,42 @@ The API is documented in the FaceSwapLab tags in the http://localhost:7860/docs
You don't have to use the api_utils.py file and pydantic types, but it can save time.
## Experimental GPU support
You need a sufficiently recent version of your SD environment. Using the GPU has a lot of little drawbacks to understand, but the performance gain is substantial.
In Version 1.2.1, the ability to use the GPU has been added, a setting that can be configured in SD at startup. Currently, this feature is only supported on Windows and Linux, as the necessary dependencies for Mac have not been included.
The `--faceswaplab_gpu` option in SD can be added to the args in webui-user.sh or webui-user.bat. **There is also an option in SD settings**.
The model stays loaded in VRAM and won't be unloaded after each use. As of now, I don't know a straightforward way to handle this, so it will occupy space continuously. If your system's VRAM is limited, enabling this option might not be advisable.
A change has also been made that could lead to some ripple effects. Previously, detection parameters such as det_size and det_thresh were automatically adjusted when a second model was loaded. This is no longer possible, so these parameters have been moved to the global settings to enable face detection.
The `auto_det_size` option emulates the old behavior. It has no difference on CPU. BUT it will load the model twice if you use GPU. That means more VRAM comsumption and twice the initial load time. If you don't want that, you can use a det_size of 320, read below.
If you enabled GPU and you are sure you avec a CUDA compatible card and the model keep using CPU provider, please checks that you have onnxruntime-gpu installed.
### SD.NEXT and GPU
Please read carefully.
Using the GPU requires the use of the onnxruntime-gpu>=1.15.0 dependency. For the moment, this conflicts with older SD.Next dependencies (tensorflow, which uses numpy and potentially rembg). You will need to check numpy>=1.24.2 and tensorflow>=2.13.0.
You should therefore be able to debug a little before activating the option. If you don't feel up to it, it's best not to use it.
The first time the swap is used, the program will continue to use the CPU, but will offer to install the GPU. You will then need to restart. This is due to the optimizations made by SD.Next to the installation scripts.
For SD.Next, the best is to install dependencies manually :
on windows :
```shell
.\venv\Scripts\activate
cd .\extensions\sd-webui-faceswaplab\
pip install .\requirements-gpu.txt
```
## Settings
You can change the program's default behavior in your webui's global settings (FaceSwapLab section in settings). This is particularly useful if you want to have default options for inpainting or for post-processsing, for example.
@ -164,3 +213,13 @@ You can change the program's default behavior in your webui's global settings (F
The interface must be restarted to take the changes into account. Sometimes you have to reboot the entire webui server.
There may be display bugs on some radio buttons that may not display the value (Codeformer might look disabled for instance). Check the logs to ensure that the transformation has been applied.
### det_size and det_thresh (detection accuracy and performances)
V1.2.1 : A change has been made that could lead to some ripple effects. Previously, detection parameters such as det_size and det_thresh were automatically adjusted when a second model was loaded. This is no longer possible, so these parameters have been moved to the global settings to enable face detection.
The `auto_det_size` option emulates the old behavior. It has no difference on CPU. BUT it will load the model twice if you use GPU. That means more VRAM comsumption and twice the initial load time. If you don't want that, you can use a det_size of 320, read below.
The `det_size` parameter defines the size of the detection area, controlling the spatial resolution at which faces are detected within an image. A larger detection size might capture more facial details, enhancing accuracy but potentially impacting processing speed. Conversely, the `det_thresh` parameter represents the detection threshold, serving as a sensitivity control for face detection. A higher threshold value leads to more conservative detection, capturing only the most prominent faces, while a lower threshold might detect more faces but could also result in more false positives.
It has been observed that a det_size value of 320 is more effective at detecting large faces. If there are issues with detecting large faces, switching to this value is recommended, though it might result in a loss of some quality.
Our issue tracker often contains requests that may originate from a misunderstanding of the software's functionality. We aim to address these queries; however, due to time constraints, we may not be able to respond to each request individually. This FAQ section serves as a preliminary source of information for commonly raised concerns. We recommend reviewing these before submitting an issue.
@ -71,6 +72,16 @@ The quality of results is inherently tied to the capabilities of the model and c
Consider this extension as a low-cost alternative to more sophisticated tools like Lora, or as an addition to such tools. It's important to **maintain realistic expectations of the results** provided by this extension.
#### Why is a face not detected?
Face detection might be influenced by various factors and settings, particularly the det_size and det_thresh parameters. Here's how these could affect detection:
+ Detection Size (det_size): If the detection size is set too small, it may not capture large faces adequately. A value of 320 has been found to be more effective for detecting large faces, though it might result in a loss of some quality.
+ Detection Threshold (det_thresh): If the threshold is set too high, it can make the detection more conservative, capturing only the most prominent faces. A lower threshold might detect more faces but could also result in more false positives.
If a face is not being detected, adjusting these parameters might solve the issue. Try increasing the det_size if large faces are the problem, or experiment with different det_thresh values to find the balance that works best for your specific case.
#### Issue: Incorrect Gender Detection
@ -78,11 +89,7 @@ The gender detection functionality is handled by the underlying analysis model.
#### Why isn't GPU support included?
While implementing GPU support may seem straightforward, simply requiring a modification to the onnxruntime implementation and a change in providers in the swapper, there are reasons we haven't included it as a standard option.
The primary consideration is the substantial VRAM usage of the SD models. Integrating the model on the GPU doesn't result in significant performance gains with the current state of the software. Moreover, the GPU support becomes truly beneficial when processing large numbers of frames or video. However, our experience indicates that this tends to cause more issues than it resolves.
Consequently, requests for GPU support as a standard feature will not be considered.
GPU is supported via an option see [documentation](../doc/). This is expermental, use it carefully.
#### What is the 'Upscaled Inswapper' Option in SD FaceSwapLab?
The extension runs mainly on the CPU to avoid the use of VRAM. However, it is recommended to follow the specifications recommended by sd/a1111 with regard to prerequisites. At the time of writing, a version of python lower than 11 is preferable (even if it works with python 3.11, model loading and performance may fall short of expectations).
Older versions of gradio don’t work well with the extension. See this bug report : https://github.com/glucauze/sd-webui-faceswaplab/issues/5. It has been tested on 3.32.0
### Windows-User : Visual Studio ! Don't neglect this !
Before beginning the installation process, if you are using Windows, you need to install this requirement:
@ -18,6 +20,12 @@ Before beginning the installation process, if you are using Windows, you need to
3. OR if you don't want to install either the full Visual Studio suite or the VS C++ Build Tools: Follow the instructions provided in section VIII of the documentation.
## SD.Next / Vladmantic
SD.Next loading optimizations in relation to extension installation scripts can sometimes cause problems. This is particularly the case if you copy the script without installing it via the interface.
If you get an error after startup, try restarting the server.
"Use GPU, only for CUDA on Windows/Linux - experimental and risky, can messed up dependencies (requires restart)",
gr.Checkbox,
{"interactive":True},
section=section,
),
)
shared.opts.add_option(
"faceswaplab_keep_original",
shared.OptionInfo(
@ -37,11 +47,33 @@ def on_ui_settings() -> None:
),
)
shared.opts.add_option(
"faceswaplab_det_size",
shared.OptionInfo(
640,
"det_size : Size of the detection area for face analysis. Higher values may improve quality but reduce speed. Low value may improve detection of very large face.",
gr.Slider,
{"minimum":320,"maximum":640,"step":320},
section=section,
),
)
shared.opts.add_option(
"faceswaplab_auto_det_size",
shared.OptionInfo(
True,
"Auto det_size : Will load model twice and test faces on each if needed (old behaviour). Takes more VRAM. Precedence over fixed det_size",
# If no faces are detected print a warning to user about change in detection
else:
ifdet_size[0]>320:
logger.warning(
"No faces detected, you might want to play with det_size by reducing it (in sd global settings). Lower (320) means more detection but less precise. Or activate auto-det-size."
)
try:
# Sort the detected faces based on their x-coordinate of the bounding box