UTF-8 names with boost::filestream under MinGW

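I have a problem with the boost file streams: I need to create and modify files in the user's directory under Windows. However, the user name contains an umlaut, which makes this fail when compiled under MinGW, because the standard lacks the wide_char open() API for file streams that boost uses. See Read/Write file with unicode file name with plain C++/Boost, UTF-8-compliant IOstreams and https://svn.boost.org/trac10/ticket/9968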

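However, from what I have read, this problem mainly occurs when trying to use characters outside the system code page. In my case I only use characters from the system code page, since the user directory obviously exists. This makes me think it should work if I could tell boost::filesystem::path to expect all std::strings as being UTF-8, but to convert them to the system encoding when the string() member function is called (which happens in boost::filesystem::fstream::open).

So basically: is there a way to do this conversion (UTF-8 -> system encoding) automatically using boost (and Boost.Locale)?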

For completeness, here is the code I use to set up the locale:

#ifdef _WIN32 
     // On windows we want to enforce the encoding (mostly UTF8). Also using "" would use the default which uses "wrong" separators 
     std::locale::global(boost::locale::generator().generate("C")); 
#else 
     // In linux/OSX this suffices 
     std::locale::global(std::locale::classic()); 
#endif // _WIN32 
     // Use also the encoding (mostly UTF8) for bfs paths 
     bfs::path::imbue(std::locale());
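
The conversion asked about here (UTF-8 to the system encoding) can also be done by hand with the Win32 API. The following is only a minimal sketch under assumptions: `utf8_to_system` is a hypothetical helper name and not part of Boost, and characters that do not exist in the active code page are silently replaced by the system default character instead of being reported.

```cpp
#include <string>

#ifdef _WIN32
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#include <vector>
#endif

// Hypothetical helper (not part of Boost): re-encodes a UTF-8 string in the
// active Windows code page. On other systems the input is returned unchanged.
std::string utf8_to_system(const std::string& utf8)
{
#ifdef _WIN32
    if (utf8.empty())
        return std::string();

    const int len = static_cast<int>(utf8.size());

    // Step 1: UTF-8 -> UTF-16. The first call only asks for the needed size.
    const int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len, NULL, 0);
    if (wlen <= 0)
        return utf8; // conversion failed: leave the input untouched
    std::vector<wchar_t> wide(wlen);
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), len, &wide[0], wlen);

    // Step 2: UTF-16 -> active code page (CP_ACP).
    const int alen = WideCharToMultiByte(CP_ACP, 0, &wide[0], wlen,
                                         NULL, 0, NULL, NULL);
    if (alen <= 0)
        return utf8;
    std::string result(alen, '\0');
    WideCharToMultiByte(CP_ACP, 0, &wide[0], wlen, &result[0], alen, NULL, NULL);
    return result;
#else
    // Assumption: other systems accept UTF-8 file names as they are.
    return utf8;
#endif
}
```

The result could then be handed to a narrow-character open() call, for example `stream.open(utf8_to_system(name).c_str())`. This only helps as long as every character of the name exists in the system code page, which is the situation described in the question.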

Source

2017-09-25 Flamefire

I found two solutions using other libraries, both of which have their drawbacks.

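Pathie (Docu): It looks like a full replacement for boost::filesystem, providing UTF-8 aware streams and path handling as well as symlink creation and other file/folder operations. Really cool is the built-in support for getting special directories (temp, home, programs folder and so on).
Drawback: Only works as a dynamic library, because the static build has bugs. It may also be overkill if you already use boost.
Boost.NoWide (Docu): Provides alternatives to almost all file and stream handlers to support UTF-8 on Windows, and falls back to the standard functions on other systems. The file streams accept UTF-8 encoded values (for the name), and it uses boost itself.
Drawback: No path handling, and it accepts neither bfs::path nor wide strings (the internal format of bfs::path on Windows is UTF-16), so a patch is required, although it is a simple one. It also covers console output of UTF-8 strings in the style of std::cout (yes, that works directly!).
Another cool thing: It provides a class to convert argc/argv to UTF-8 on Windows.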
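
A UTF-8 aware stream on Windows has to turn the name into UTF-16 before it can call the wide file APIs. The sketch below writes that widening step out by hand, purely as an illustration: it is not Boost.Nowide's actual code, `utf8_to_utf16` is a hypothetical name, and the decoder is simplified (malformed bytes become U+FFFD, overlong encodings are not rejected).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 and re-encodes it as UTF-16.
// Simplified: invalid bytes are replaced by U+FFFD, overlong forms pass.
std::u16string utf8_to_utf16(const std::string& in)
{
    const char16_t replacement = static_cast<char16_t>(0xFFFD);
    std::u16string out;
    std::size_t i = 0;

    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(replacement); ++i; continue; }

        // Collect the continuation bytes (each must look like 10xxxxxx).
        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok || cp > 0x10FFFF) { out.push_back(replacement); ++i; continue; }
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the resulting code units are what the wide file functions expect; in real code the Win32 function MultiByteToWideChar with CP_UTF8 does the same job.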

Source

2017-09-27 21:55:05 Flamefire

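This is a problem on Windows because Windows uses UTF-16, not UTF-8. I regularly use this function to solve exactly your problem (I have stripped out a few things in order to post it here):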

// get_filename_token.cpp 

// Turns a UTF-8 filename into something you can pass to fstream::open() on 
// Windows. Returns the argument on other systems. 

// Copyright 2013 Michael Thomas Greer 
// Distributed under the Boost Software License, Version 1.0. 
// (See accompanying file LICENSE_1_0.txt 
// or copy at   http://www.boost.org/LICENSE_1_0.txt) 

#include <string> 

#ifdef _WIN32 

#ifndef NOMINMAX 
#define NOMINMAX 
#endif 
#include <windows.h> 

std::string get_filename_token(const std::string& filename) 
    { 
    // Convert the UTF-8 argument path to a Windows-friendly UTF-16 path 
    wchar_t* widepath = new wchar_t[ filename.length() + 1 ]; 
    MultiByteToWideChar(CP_UTF8, 0, filename.c_str(), -1, widepath, filename.length() + 1); 

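    // Now get the 8.3 version of the name 
    DWORD n = GetShortPathNameW(widepath, NULL, 0); 
    if (n == 0) 
        { 
        // No short name (e.g. the file does not exist yet): return the name unchanged 
        delete [] widepath; 
        return filename; 
        } 
    wchar_t* shortpath = new wchar_t[ n ]; 
    GetShortPathNameW(widepath, shortpath, n); 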

    // Convert the short version back to a C++-friendly char version 
    n = WideCharToMultiByte(CP_UTF8, 0, shortpath, -1, NULL, 0, NULL, NULL); 
    char* ansipath = new char[ n ]; 
    WideCharToMultiByte(CP_UTF8, 0, shortpath, -1, ansipath, n, NULL, NULL); 

    std::string result(ansipath); 

    delete [] ansipath; 
    delete [] shortpath; 
    delete [] widepath; 

    return result; 
    } 

#else 

std::string get_filename_token(const std::string& filename) 
    { 
    // For all other systems, just return the argument UTF-8 string. 
    return filename; 
    } 

#endif

Source

2017-09-25 03:50:07

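Yes, I guess I need to go down this route. I am even thinking of creating a new iofstream class that, like the boost classes, provides new open functions and ctors. Why do you convert back to UTF-8? Wouldn't CP_ACP be better? And why doesn't boost do it this way, given that it looks quite simple? Are there drawbacks, such as 8.3 names not always being ANSI? – Flamefire

Found a drawback: this only works for existing files. So one has to make sure the file actually exists, which may open the door to race conditions when trying to create it with a widechar implementation and then opening it via the short path. – Flamefire

Because it is cross-platform. All modern *nixen will open a UTF-8 filename, and it does not break older code. Likewise, _all_ Windows filenames can be converted to an OS-acceptable 8.3 filename "token", which is technically a UTF-8 subset. –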
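
The drawback raised in the comments (a short path only exists for a file that already exists) can be avoided by opening the file through the wide C API and wrapping the resulting FILE* in a stream buffer. The following is a sketch under assumptions, not a tested recipe for every toolchain: it relies on `__gnu_cxx::stdio_filebuf`, a GNU libstdc++ extension that MinGW's g++ ships, and `fopen_utf8` and `write_text_utf8_name` are hypothetical helper names.

```cpp
#include <cstdio>
#include <ext/stdio_filebuf.h> // GNU libstdc++ extension, not standard C++
#include <ios>
#include <ostream>
#include <string>

#ifdef _WIN32
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <wchar.h>
#include <windows.h>
#include <vector>

// UTF-8 -> null-terminated UTF-16 buffer.
static std::vector<wchar_t> widen(const char* utf8)
{
    const int n = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    std::vector<wchar_t> wide(n > 0 ? n : 1, L'\0');
    if (n > 0)
        MultiByteToWideChar(CP_UTF8, 0, utf8, -1, &wide[0], n);
    return wide;
}
#endif

// Hypothetical helper: fopen() that takes a UTF-8 name on every platform.
std::FILE* fopen_utf8(const std::string& name, const char* mode)
{
#ifdef _WIN32
    const std::vector<wchar_t> wname = widen(name.c_str());
    const std::vector<wchar_t> wmode = widen(mode);
    return _wfopen(&wname[0], &wmode[0]);
#else
    // Assumption: other systems accept UTF-8 file names directly.
    return std::fopen(name.c_str(), mode);
#endif
}

// Hypothetical helper: creates (or truncates) the file and writes `text`
// to it through an ordinary std::ostream. Returns false on failure.
bool write_text_utf8_name(const std::string& name, const std::string& text)
{
    std::FILE* file = fopen_utf8(name, "wb");
    if (!file)
        return false;

    bool ok = false;
    {
        // The buffer uses the FILE* but does not take ownership of it.
        __gnu_cxx::stdio_filebuf<char> buf(
            file, std::ios_base::out | std::ios_base::binary);
        std::ostream out(&buf);
        out << text;
        out.flush();
        ok = out.good();
    }
    // The FILE* constructor of stdio_filebuf leaves closing to the caller.
    return std::fclose(file) == 0 && ok;
}
```

Because the file is created and opened in a single call, there is no window between creating the file and looking up its short name.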
