Thursday, January 17, 2019

Building Unicode applications with TCC in Windows

James Russell Moore
5 years ago
Hello, I have a small utility which makes use of wide characters in
Windows. It's written using _TCHAR though so it can be built with multibyte
characters as well.

The main() function header is written as follows:

int _tmain(int argc, _TCHAR* argv[])

This translates to "main" in a multibyte build and to "wmain" in a Unicode build,
but TCC seems to expect a main() function rather than a wmain() one, so it
can't produce Unicode builds (I tried with a mob build as well).
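
As a rough illustration of the mapping described above, here is a minimal, portable sketch of how a <tchar.h>-style header can switch between narrow and wide variants. All names (DEMO_UNICODE, demo_tchar, DEMO_T, demo_tcslen, demo_length) are hypothetical stand-ins and not the real <tchar.h> definitions.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for the _TCHAR / _T() / _tcslen family. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;      /* wide build: one wchar_t per unit */
#define DEMO_T(s) L##s           /* turns "abc" into L"abc" */
#define demo_tcslen wcslen
#else
typedef char demo_tchar;         /* narrow (multibyte) build */
#define DEMO_T(s) s
#define demo_tcslen strlen
#endif

/* The same source line compiles against either character type. */
size_t demo_length(const demo_tchar *s)
{
    return demo_tcslen(s);
}
```

Building with -DDEMO_UNICODE selects the wide variants, in the same way that a Unicode build makes _tmain resolve to wmain.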

Is this expected or is there something wrong? Would it be possible to make a
modification to enable wmain() to be an entry point as well as main()?

Attached to this email is a small test case.

Thanks for your time anyway.
YX Hao
5 years ago
I think TCC hasn't implemented this.
...
Maybe you can take a look at:
win32\lib\crt1.c
win32\lib\wincrt1.c
Daniel Glöckner
5 years ago
Post by YX Hao
Post by James Russell Moore
Is this expected or is there something wrong? Would it be possible to make a
modification to enable wmain() to be an entry point as well as main()?
win32\lib\crt1.c
In crt1.c declare main and wmain as weak.
In _start call __getmainargs/main if main and __wgetmainargs/wmain
if !main.

I wonder if this works when linking to static libraries like libfl
(from GNU flex) that contain a main function.
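
A minimal sketch of the weak-symbol idea, assuming a GCC-compatible toolchain on an ELF platform, where an undefined weak function resolves to a null address. The names narrow_main, wide_main and dispatch_entry are hypothetical; this is not TCC's actual crt1.c, and, as reported later in the thread, the attribute did not appear to work with TCC on Windows.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical entry points. Being weak, an entry point that the program
   does not define is expected to resolve to a null address at link time. */
int narrow_main(int argc, char **argv) __attribute__((weak));
int wide_main(int argc, wchar_t **argv) __attribute__((weak));

/* Startup-style dispatcher: prefer the narrow entry point if it exists,
   otherwise fall back to the wide one. */
int dispatch_entry(int argc, char **argv, wchar_t **wargv)
{
    if (narrow_main)
        return narrow_main(argc, argv);
    if (wide_main)
        return wide_main(argc, wargv);
    return -1; /* neither entry point was linked in */
}

/* In this sketch the program provides only the narrow entry point. */
int narrow_main(int argc, char **argv)
{
    (void)argv;
    return argc;
}
```

A real startup routine would fetch the arguments with __getmainargs or __wgetmainargs before dispatching; here they are simply passed in.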

Daniel
YX Hao
5 years ago
One more thing: on Windows, using a Unicode console environment is not convenient. It's not the default, so you may need to change the settings back and forth several times.
Do you really want to pass Unicode argv? Usually there are wide-character functions that can be used instead.
...
Carlos Montiers
5 years ago
Look at this code, adapted from the current development version of my bg tool (
http://consolesoft.com/p/bg ):

#define UNICODE

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <string.h>
#include <ctype.h>

#undef putchar
#undef putwchar
#undef isxdigit
#undef iswxdigit
#define putchar(c) fputc(c, stdout)
#define putwchar(c) fputwc(c, stdout)
#define isxdigit(d) _isctype(d, _HEX)
#define iswxdigit(d) iswctype(d, _HEX)

#ifdef UNICODE
#define strtoul wcstoul
#define _strupr _wcsupr
#define strcmp wcscmp
#define atol _wtol
#define _isctype iswctype
#define fputc fputwc
#define ReadConsoleInput ReadConsoleInputW
#define PlaySound PlaySoundW
#else
#define ReadConsoleInput ReadConsoleInputA
#define PlaySound PlaySoundA
#endif



/* Startup information block passed to the CRT argument helpers. */
typedef struct {
    int newmode;
} _startupinfo;

/* Argument helpers exported by msvcrt.dll (wide and narrow variants). */
void __wgetmainargs(int *_Argc, wchar_t ***_Argv, wchar_t ***_Env,
                    int _DoWildCard, _startupinfo *_StartInfo);
void __getmainargs(int *_Argc, char ***_Argv, char ***_Env,
                   int _DoWildCard, _startupinfo *_StartInfo);

int my_main(int argc, TCHAR *argv[]);

/* Overrides the default _start so that argv can be wide in a Unicode build. */
void _start(void)
{
    int argc;
    TCHAR **argv;
    TCHAR **env;
    int ret;

    _startupinfo start_info = { 0 };
#ifdef UNICODE
    __wgetmainargs(&argc, &argv, &env, 0, &start_info);
#else
    __getmainargs(&argc, &argv, &env, 0, &start_info);
#endif

    ret = my_main(argc, argv);
    exit(ret);
}

int my_main(int argc, TCHAR *argv[])
{
    return 0;
}

For printf, a call like this:
printf("Hello %s\n", argv[0]);
needs to be replaced with this in a Unicode build:
wprintf(L"Hello %ls\n", argv[0]);
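
A small sketch of the same %ls conversion, written against a buffer so the result can be checked. The helper name format_greeting is hypothetical, and swprintf is used with its standard C signature (buffer size as second argument), which some older Windows runtimes may not follow.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Formats a greeting for a wide-character name into out.
   %ls is the standard C conversion for a wchar_t string.
   Returns the number of wide characters written, or a negative
   value if the buffer is too small. */
int format_greeting(wchar_t *out, size_t cap, const wchar_t *name)
{
    return swprintf(out, cap, L"Hello %ls\n", name);
}
```

In a Unicode build, argv[0] from the wide entry point could be passed as name.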
...
James Russell Moore
5 years ago
Post by Daniel Glöckner
In crt1.c declare main and wmain as weak.
In _start call __getmainargs/main if main and __wgetmainargs/wmain
if !main.
I searched around for weak symbols because I didn't know about them,
thanks. I tried to declare them as weak but it doesn't seem to work (in
Windows maybe?). I tried placing the __attribute__((weak)) both before the
semicolon and before the return type of the functions; in either case errors
were shown about the missing main or wmain function, depending on the
setting. GCC also allows a pragma for this, but I think that's not implemented in
TCC.
Post by YX Hao
One more thing: on Windows, using a Unicode console environment is not
convenient. It's not the default, so you may need to change the settings
back and forth several times.
Do you really want to pass Unicode argv? Usually there are wide-character
functions that can be used instead.
In a Unicode build I know how long a character is, so I can iterate through
the characters of a string in the same way as if they were simple chars.
Without Unicode, a character may use more than 1 byte, so it's more
complicated to know its length. I guess I could interpret char strings as
UTF-8 too, as in Linux, but in Windows things seem to get more
complicated with code pages and the like. Think, for example, about creating
a file whose name contains Unicode characters, or echoing the command
line.
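
To make the length argument concrete, here is a minimal sketch comparing the two approaches. The helper names utf8_count and wide_count are hypothetical, and utf8_count assumes its input is valid UTF-8. Note that wchar_t is 16 bits on Windows, so characters outside the Basic Multilingual Plane still take two units there.

```c
#include <stddef.h>
#include <wchar.h>

/* Counts characters (code points) in a UTF-8 string by skipping
   continuation bytes, which all have the bit pattern 10xxxxxx.
   Assumes the input is valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* With wide strings, the unit count comes directly from wcslen. */
size_t wide_count(const wchar_t *s)
{
    return wcslen(s);
}
```

For "café", the UTF-8 encoding is 5 bytes long but both helpers report 4 characters.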
Post by Carlos Montiers
Look at this code, adapted from the current development version of my bg tool
Thanks, that works with TCC without any modification. Overriding the _start
routine works fine for Unicode builds, but it'd have to be conditional in
the program to allow for interoperability with other compilers.
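
One way the conditional could look, as a hedged sketch only: the _start override is compiled just for TCC (which predefines __TINYC__) in a Windows Unicode build, while other Windows compilers get an ordinary wmain. The names app_char and app_main are hypothetical, UNICODE is assumed to be set by the build (for example with -DUNICODE), and the __wgetmainargs declaration is copied from the code posted above.

```c
#include <stdlib.h>
#include <wchar.h>

#if defined(_WIN32) && defined(UNICODE)
typedef wchar_t app_char;   /* wide arguments in a Windows Unicode build */
#else
typedef char app_char;      /* narrow arguments everywhere else */
#endif

/* The program's real logic, independent of how it gets started.
   Returns 0 when at least the program name was supplied. */
int app_main(int argc, app_char **argv)
{
    (void)argv;
    return argc > 0 ? 0 : 1;
}

#if defined(_WIN32) && defined(UNICODE)
#if defined(__TINYC__)
/* TCC: override _start and fetch wide arguments by hand. */
typedef struct { int newmode; } _startupinfo;
void __wgetmainargs(int *_Argc, wchar_t ***_Argv, wchar_t ***_Env,
                    int _DoWildCard, _startupinfo *_StartInfo);

void _start(void)
{
    int argc;
    wchar_t **argv;
    wchar_t **env;
    _startupinfo start_info = { 0 };

    __wgetmainargs(&argc, &argv, &env, 0, &start_info);
    exit(app_main(argc, argv));
}
#else
/* Other Windows compilers: the usual wide entry point. */
int wmain(int argc, wchar_t **argv)
{
    return app_main(argc, argv);
}
#endif
#endif
/* A narrow build would add an ordinary main() that forwards to app_main;
   it is left out of this sketch. */
```

Keeping the program logic in app_main means only the small entry-point shim differs between compilers.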
YX Hao
5 years ago
Hi,
...
Try:
Findstr /s /n /r /c:"[^a-z]main[^a-z]" *.c

You will see the code related to the linking stage, together with the CRT code.
You know what linking does, so you'll get the point.

I think it needs a _start/_wstart pair and support in the linking process.
I searched the gcc and ld code to see how it could be done, but couldn't figure it out.
...
So you need it.
Look at this code, adapted from the current development version of my bg tool
Thanks, that works with TCC without any modification. Overriding the _start
routine works fine for Unicode builds, but it'd have to be conditional in the
program to allow for interoperability with other compilers.
Someone experienced, like grischka, may be interested in implementing this capability for TCC.

Regards.

Source text:

https://tinycc-devel.nongnu.narkive.com/kQoPYDTv/building-unicode-applications-with-tcc-in-windows